A Test of the Hotelling Valuation Principle


Merton H. Miller 

University of Chicago

Charles W. Upton 

Rutgers University 


Time-series tests of the Hotelling r-percent rule for natural resource
prices have not been strongly supportive, but the tests and the data
are subject to serious difficulties. We propose here an alternative
testing strategy based on another but less widely known implication
of the Hotelling model. We test this implication, which we call the
Hotelling Valuation Principle, by regressing the market values of the
reserves of a sample of U.S. domestic oil- and gas-producing
companies on their estimated Hotelling values. We find that the
estimated Hotelling values account for a significant portion of the
observed variations in market values and that the Hotelling
measures are better indicators of the market values of petroleum
properties than two widely cited publicly available alternative appraisals.


I. Introduction 

The proposition that the unit price of an exhaustible natural
resource, less the marginal cost of extracting it, will tend to rise over
time at a rate equal to the return on comparable capital assets has 
come to be called the Hotelling Principle in recognition of the econo- 


We acknowledge with thanks the comments and suggestions of Craig Ansley, Robert
Barro, Nai-fu Chen, Geoffrey Heal, Richard Leftwich, Steven Manaster, Terry Marsh,
Harry Roberts, Rodney Smith, Hal Varian, William Wecker, Mark Wolfson, and the
participants in the seminars where earlier versions of the paper were presented. Pierre
Coureil and Roger Lustig provided valuable computational assistance.

cash-flow capital budgeting methods for valuing mineral properties

II. The Hotelling Valuation Principle and Its 
Testable Implications 

Although Hotelling's original model has been extended in a variety of
ways in the more than half-century since its publication, the essential
economic point of his analysis can be conveyed in a simple
discrete-time certainty framework.


A. A Simple Restatement of the Hotelling Pricing Principle


Consider, for example, the problem confronting a profit-maximizing,
price-taking owner of an exhaustible resource at time zero. The
reserves may be extracted either in the current or in any of the next N
periods. Let extraction costs at time t be $C_t = C_t(q_t, Q_t)$, where $q_t$ is the
current rate of extraction and $Q_t = \sum_{s=0}^{t} q_s$ the cumulative level of
extraction. As to the properties of $C_t$, we know that $\partial C_t/\partial q_t$ must be
positive, since total extraction costs in period t will rise with the
amount extracted; $\partial C_t/\partial Q_t$ is nonnegative and will be positive if
additional reserves are increasingly expensive to extract. The discounted
present value of profits is then

$$V_0 = \sum_{t=0}^{N} \frac{p_t q_t - C_t(q_t, Q_t)}{(1 + r)^t}, \tag{1}$$


where $p_t$ is the exogenously given market price of output at time t
(assumed, in this simplest case, to be known with certainty to the
producer), r is the rate of interest (again assumed known and constant
over time), and N is a known date beyond which production can safely
be presumed to have ceased. $V_0$ is maximized subject to the constraint

$$\sum_{t=0}^{N} q_t \le R_0, \tag{2}$$

where $R_0$ are total reserves. Expression (2) is written as an inequality
to remind us that reserves are in principle an economic quantity, not a
technological datum, and that it need not be optimal to extract all of
the reserves. Assuming that it is, and thus concerning ourselves only
with interior solutions, the first-order condition for profit
maximization in any period is

$$(p_t - c_t)\frac{1}{(1 + r)^t} - \sum_{j=t}^{N} \frac{\partial C_j}{\partial Q_j}\frac{1}{(1 + r)^j} = \lambda, \qquad t = 0, \ldots, N, \tag{3}$$

where $c_t = \partial C_t/\partial q_t$ is marginal extraction cost in period t and $\lambda$ is the
Lagrangian multiplier on the constraint (2). 3

To simplify further, consider the special case of a production
function with extraction costs per unit of output independent of
cumulative output so that $\partial C_t/\partial Q_t = 0$. Then the first-order condition (3)
reduces to

$$(p_t - c_t)\frac{1}{(1 + r)^t} = \lambda. \tag{3'}$$

In an optimal production program under the assumed cost structure,
the net present value of the net price per unit of output must be the
same regardless of when it is produced. Solving the system of
difference equations (3') we obtain the familiar Hotelling Principle:

$$(p_t - c_t) = (p_0 - c_0)(1 + r)^t, \qquad t = 0, \ldots, N. \tag{4}$$

That is, the efficient intertemporal production of an exhaustible
resource implies that the real price of the resource, net of marginal
extraction costs, grows over time at a rate equal to the real rate of
interest.
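
The r-percent rule can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function names and the sample values (an initial net price of 10, an interest rate of 5 percent, five periods) are assumptions chosen only for illustration. It generates the net price path of the Hotelling Principle and confirms that every period's net price has the same present value.

```python
def hotelling_net_prices(net_price_0, r, n_periods):
    """Net price path (p_t - c_t) that grows at the rate of interest r."""
    return [net_price_0 * (1.0 + r) ** t for t in range(n_periods + 1)]


def present_values(net_prices, r):
    """Discount each period's net price back to time zero."""
    return [x / (1.0 + r) ** t for t, x in enumerate(net_prices)]


# Hypothetical inputs, chosen only for illustration.
path = hotelling_net_prices(net_price_0=10.0, r=0.05, n_periods=5)
pvs = present_values(path, r=0.05)

for t, (price, pv) in enumerate(zip(path, pvs)):
    print(f"t={t}  net price={price:.4f}  present value={pv:.4f}")
```

Each present value equals the initial net price, which is the sense in which the owner is indifferent about when a given unit is produced.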

B. The Hotelling Valuation Principle 

With the further assumption of constant returns to scale in current as
well as cumulative extraction, the valuation expression (1) takes an
extremely simple form. Marginal cost is then the same as average cost
so that substitution of (4) into (1) yields as the present value of total
reserves

$$V_0 = (p_0 - c_0)\sum_{t=0}^{N} q_t = (p_0 - c_0)R_0. \tag{5}$$

In words, in a world where output prices, net of extraction costs, obey
the Hotelling Principle, the value of the total reserves in any mineral
property depends solely on the current spot price per unit of the
mineral, net of current extraction costs. True, units of production
deferred to future years will earn higher net prices. But in a
Hotelling world, the present value of the net price of any unit must be the
same, regardless of when extracted. The growth of the net price in
the numerators of the terms in the valuation summation (1) will be
exactly offset by the discount factors in the denominators.
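
A short numerical sketch may make the offsetting of numerators and denominators concrete. It assumes constant unit extraction cost and a net price growing at the rate r; the function name, the two extraction schedules, and the numbers (net price 12, interest rate 8 percent, reserves of 100 units) are hypothetical and serve only as an illustration.

```python
def reserve_value(net_price_0, r, extraction):
    """Present value of reserves when the net price grows at rate r.

    `extraction` lists the quantity produced in periods 0, 1, 2, ...
    """
    total = 0.0
    for t, q in enumerate(extraction):
        net_price_t = net_price_0 * (1.0 + r) ** t   # Hotelling price path
        total += net_price_t * q / (1.0 + r) ** t    # discounted net revenue
    return total


# Two hypothetical schedules that exhaust the same 100 units of reserves.
front_loaded = [60.0, 30.0, 10.0, 0.0]
back_loaded = [0.0, 10.0, 30.0, 60.0]

v_front = reserve_value(12.0, 0.08, front_loaded)
v_back = reserve_value(12.0, 0.08, back_loaded)
print(v_front, v_back)  # both equal net price times reserves, up to rounding
```

Under these assumptions the two schedules give the same value, the current net price times total reserves, regardless of when the units are extracted.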

3 If (2) is not binding, we redefine $R_0$ to be the reserves that will be recovered, thus
ensuring an equality constraint.

Expression (5) is a special case of what we call the Hotelling Valuation
Principle to emphasize that it is distinct from, but still an
implication of, the Hotelling Principle proper. 4 The implication can be
tested, in principle, by regressing observed market values per unit of
reserves for the mineral properties of particular companies at a given
point in time (not necessarily the same for all the included companies)
on the then current output prices net of marginal extraction costs, as
in

$$\frac{V_0^{it}}{R_0^{it}} = \alpha + \beta\,(p_0^{it} - c_0^{it}), \tag{6}$$

where i indexes companies, t indexes calendar time, and 0 signifies
the then current values as of the sample date t. 5 The test of the
Hotelling Valuation Principle then hinges on the values of the
coefficients. Under the constant returns assumption, for example, the
Hotelling Valuation Principle implies $\alpha = 0$ and $\beta = 1$ and implies as
well that additional variables such as interest rates or projected future
mineral prices should contribute nothing to the explanatory power of
(6).
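
The form of the proposed test can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the data are simulated (they are not the sample used in this paper), the helper `ols` is a hypothetical name, and the noise level is arbitrary. It fits a simple regression of value per unit of reserves on the net price and forms t-ratios against the null values of zero for the intercept and one for the slope.

```python
import random


def ols(x, y):
    """Simple least squares of y on a constant and x.

    Returns (intercept, slope, se_intercept, se_slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - 2)      # residual variance
    se_slope = (s2 / sxx) ** 0.5
    se_intercept = (s2 * (1.0 / n + mean_x ** 2 / sxx)) ** 0.5
    return intercept, slope, se_intercept, se_slope


# Simulated data generated under the null: value per unit = net price + noise.
rng = random.Random(0)
net_price = [rng.uniform(2.0, 20.0) for _ in range(200)]
value_per_unit = [x + rng.gauss(0.0, 1.0) for x in net_price]

alpha, beta, se_alpha, se_beta = ols(net_price, value_per_unit)
t_alpha = alpha / se_alpha            # test of intercept = 0
t_beta = (beta - 1.0) / se_beta       # test of slope = 1
print(alpha, beta, t_alpha, t_beta)
```

Because the simulated data satisfy the constant-returns version of the principle by construction, the estimated intercept should be near zero and the slope near one; with real data the same two t-ratios carry the test.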

The constant returns case, though perhaps useful as a first
approximation, is extremely restrictive and unnecessarily so. Before turning
to the data, therefore, we extend the analysis underlying (6) to allow
for nonconstant returns to scale in current and cumulative
production and for the use of average rather than marginal extraction costs
in the estimating equation. We go on to consider briefly some of the
consequences for valuation of cartel or government-imposed price
controls as well as of uncertainty about the course of future product
prices. Finally, we note some complications traceable to the treatment
of mineral properties under the U.S. Internal Revenue Code.

4 Although the Valuation Principle is a fairly direct corollary of the Pricing Principle,
it has received surprisingly little attention in the literature. Two recent studies invoking
the Valuation Principle, though somewhat indirectly and not by that name, are Brown
and Field (1978) and Hartwick (1978).

5 The prices in our empirical tests will be well-head prices. Because of
differences in transportation costs and local severance taxes, these prices will in general differ
from company to company at the same point in time even under otherwise strictly
competitive conditions. Note also that while theoretical valuation expressions
have been, and on occasion continue to be, expressed for convenience in terms
of total values, their likely heteroscedasticity makes them less suitable for the empirical
tests than per unit equations like (6).

C. Extensions of the Valuation Principle

1. Nonconstant Returns to Scale in Production

Diminishing returns to scale in current production will affect only the
constant term in equation (6), not the slope. To see why, return to our
general cost function $C_t = C_t(q_t, Q_t)$ and assume that both $\partial C_t/\partial q_t$ and
$\partial^2 C_t/\partial q_t^2$ are positive. To focus sharply on the distinction between
marginal and average cost, continue to ignore the cumulative cost terms
and assume $\partial C_t/\partial Q_t = 0$. To simplify the derivation further, let
$F_t = \bar{c}_t q_t - c_t q_t = C_t - c_t q_t$ be the difference between average and marginal
cost multiplied by output, where $\bar{c}_t = C_t/q_t$ is average extraction
cost. The valuation equation (1) then becomes

$$V_0 = \sum_{t=0}^{N} (p_t - c_t)\,q_t \frac{1}{(1 + r)^t} - \sum_{t=0}^{N} F_t \frac{1}{(1 + r)^t}. \tag{7}$$

Substituting successively from the first-order condition (3') for each
term $(p_t - c_t)/(1 + r)^t$ in (7), we obtain

$$V_0 = (p_0 - c_0)R_0 - \sum_{t=0}^{N} F_t \frac{1}{(1 + r)^t}. \tag{8}$$

Since costs are an increasing function of output in any period (i.e.,
$\partial^2 C_t/\partial q_t^2 > 0$), $F_t$ is negative and the second term on the right-hand
side of (8) will be positive. Note, however, that the $c_0$ in the first term is
the marginal cost of extraction at output level $q_0$. The accounting data
on which our empirical tests rely can give us only average extraction
costs $\bar{c}_t$. With this change of variable, (8) can be rewritten in per unit
form as

$$\frac{V_0}{R_0} = (p_0 - \bar{c}_0) + K_1, \tag{9}$$

where

$$K_1 = \frac{F_0}{q_0} - \frac{1}{R_0}\sum_{t=0}^{N} F_t \frac{1}{(1 + r)^t}.$$
The sign of $K_1$ is ambiguous since the first component is negative
and the second positive. The intercept term $\alpha$ in the proposed
regression (6) can thus no longer be presumed to be zero under the
Hotelling Valuation Principle. But any departure is likely to be small, since
the two offsetting components of $K_1$ are of the same order of
magnitude. 6 The slope coefficient $\beta$, on the other hand, remains unity
provided only that the Hotelling variables $(p_0^{it} - c_0^{it})$ can be presumed
independent of the omitted terms $K_1$ impounded in the intercept.
Nothing guarantees this independence in any particular sample, of
course. But the steps leading to (9) do show at least that for firms
following the Hotelling Principle in allocating production over time,
none of the components of $K_1$ directly involves $(p_0 - c_0)$ or any future
net price likely to be correlated with it.

6 The two components would offset exactly in the special case in which (1) production
was the same in every year and (2) $F_t$ grew over time at the rate r. Under these
conditions, each component would reduce to the same value, $(N + 1)F_0/R_0$.
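
The special case in note 6 can be checked numerically. The sketch below assumes that $K_1$ is the sum of two components, $F_0/q_0$ and $-(1/R_0)\sum_t F_t/(1 + r)^t$, which is our reading of what equations (8) and (9) imply; the function name `k1_components` and all numerical inputs are hypothetical.

```python
def k1_components(F, q, r):
    """Return the two components of K_1 under the assumed definition.

    F[t] is (average cost - marginal cost) * output in period t,
    q[t] is output in period t, r is the interest rate.
    """
    reserves = sum(q)                                   # R_0
    first = F[0] / q[0]                                 # average minus marginal cost at t = 0
    second = -sum(f / (1.0 + r) ** t for t, f in enumerate(F)) / reserves
    return first, second


# Special case of note 6: constant production, F_t growing at rate r.
r = 0.10
N = 9
q = [5.0] * (N + 1)
F = [-2.0 * (1.0 + r) ** t for t in range(N + 1)]

first, second = k1_components(F, q, r)
print(first, second, first + second)
```

With these inputs the first component equals $(N + 1)F_0/R_0$, the second is its negative, and the two cancel, as the note states.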


2. Extraction Costs as a Function of Cumulative
Production


Although considerable simplification is achieved by assuming away
the cumulative cost terms $\partial C_t/\partial Q_t$ in the first-order conditions (3), the
simplification runs counter to the fundamental economic notion, tracing
back at least to Ricardo, that the lowest-cost units will tend to be
produced first. Hotelling himself took at least some partial steps
toward bringing cumulative production into the statement of the
model, and subsequent writers have furthered the task (see, e.g.,
Gordon 1967; Solow and Wan 1976; Hartwick 1978).

Allowing for extraction costs that rise with cumulative production
has been shown to imply a time path of net prices rising at a rate
below the real rate r of the simple case of the Hotelling Principle. The
time path of net prices reflects the opportunity cost of deferring
production: forgone interest less the saving in future production
costs.

If net prices are rising at a rate slower than the r of the Hotelling
Principle, then values should presumably be less than implied by the
Hotelling Valuation Principle, which, it will be recalled, exploits the
property that the numerators and denominators of the elements in
the present value expressions will advance in parallel in a simple
Hotelling world. Fortunately, as in the case of decreasing returns to
current production, the effects of the user cost terms can be
impounded in the intercept of the regression, leaving the presumed
slope coefficient still unity.

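To see why, return again to the general cost function $C_t(q_t, Q_t)$. $V_0$ is
found by substituting successively from the first-order conditions (3)
into the restated valuation formula (7) to obtain

$$V_0 = \lambda R_0 + \sum_{t=0}^{N} q_t \sum_{j=t}^{N} \frac{\partial C_j}{\partial Q_j}\frac{1}{(1 + r)^j} - \sum_{t=0}^{N} F_t \frac{1}{(1 + r)^t}. \tag{10}$$

But from the first equation in (3) we have

$$\lambda = (p_0 - c_0) - \sum_{j=0}^{N} \frac{\partial C_j}{\partial Q_j}\frac{1}{(1 + r)^j}. \tag{11}$$

Substituting for $\lambda$ in (10) and rearranging yields

$$V_0 = (p_0 - c_0)R_0 - \sum_{t=1}^{N} q_t \sum_{j=0}^{t-1} \frac{\partial C_j}{\partial Q_j}\frac{1}{(1 + r)^j} - \sum_{t=0}^{N} F_t \frac{1}{(1 + r)^t}. \tag{12}$$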


The second term on the right-hand side of (12) is negative, while the
third term is positive. However, when we substitute average
extraction costs for marginal costs (as in [9]), (12) becomes, in per unit form,

$$\frac{V_0}{R_0} = (p_0 - \bar{c}_0) + K_1 + K_2, \tag{13}$$

where $K_1$ is the constant term from (9) and $K_2 \le 0$ is the second term
from the right-hand side of (12) divided by $R_0$. Although the sign of
the constant term $K_1 + K_2$ is ambiguous, it seems not unlikely, given
our earlier comments about $K_1$, that the constant term in (13) and
hence the implied intercept in (6) will be negative. Note again also
that whatever the sign, none of the components of either $K_1$ or $K_2$
contains $(p_0 - c_0)$ or any future net price likely to be correlated with it.
Hence the coefficient of unity for $(p_0 - c_0)$ in (12) can be presumed to
carry through to regression (6) for this extension as well.


3. Noncompetitive Output Prices 

Hotelling recognized that some firms might be too large, relative to
total industry output, for the effects of their production decisions on
market prices to be ignored. He extended his analysis of
intertemporal allocations to the case of pure monopoly and showed that the
Pricing Principle carried through in a natural way, but with marginal
revenue replacing price. The Valuation Principle extends as well and,
in particular, the predicted slope coefficient remains unity when the
independent variable is current price (average revenue) minus
current average extraction costs. The difference between price and
marginal revenue is impounded in the intercept in essentially the same
way as the difference between average cost and marginal cost. The
details, however, need not detain us further here, since the firms in
our empirical sample of domestic, nonintegrated petroleum
producers have negligible market power.

To say that our sample firms are price takers is not to suggest, of
course, that the prices they faced were competitive. During our
sample period, which runs from 1979 to August 1981, OPEC was in its
heyday in the world market and the federal government was
imposing price controls on domestically produced oil and gas. Such
noncompetitive restrictions on the path of output prices, however,
invalidate neither the Hotelling Principle, properly understood, nor our
proposed tests of its corollary, the Hotelling Valuation Principle. The
Hotelling Principle is a proposition about optimal production
management by individual firms and not, except indirectly, industry
prices. An optimizing firm will adjust its path of outputs, and hence of
marginal extraction costs, until its own path of net prices meets the
Hotelling condition, whatever the assumed path for industry market
prices. The analysis leading to the tests for the valuation of price
takers in a Hotelling world will thus go through exactly as before
despite the actions of the price makers or price controllers. Price
controls of the U.S. variety may even increase the efficiency of testing
by introducing additional sources of variation in well-head prices such
as between "old" oil and "new" oil and the many gradations in
between. 7


4. Allowing for Uncertainty

The consequences for our test equation of weakening the assumption
of certainty depend to a considerable extent on the assumptions about
extraction costs. For the constant-returns-to-scale case, the value per
unit of the reserves remains the current spot price, net of extraction
costs, since the Hotelling Valuation Principle under those cost
conditions really says little more than that the value of the reserves is
independent of where they happen to be stored, above the ground or
below the ground. No rational buyer would pay more for the reserves
and no rational owner would take less. For that special cost case,
moreover, direct analogues to the Hotelling price growth Principle
will hold. Pindyck (1980) has shown, for example, that where the
uncertainties characterizing market demands and reserve levels can
be adequately represented as diffusions (Ito processes), the expected
rate of change of the net price will equal the opportunity cost of
funds, exactly as in the deterministic case. Sundaresan (in press) has
shown that the same conclusion holds even when the process
governing new reserves is a jump process (i.e., one admitting the possibility
of occasional major new discoveries). 8


7 Remember in this connection that unless a group of fields is operated as a single
unit, nothing in the Hotelling analysis requires net prices or marginal extraction costs to
be equal across fields (see Gordon 1967).

8 Pindyck's analysis of the price path assumes risk-neutral producers; Sundaresan
considers both risk neutrality and constant relative risk aversion. For the valuation
issues that are our main concern here, the risk adjustments are not of great moment
because they enter both the price growth terms in the numerators and the offsetting
discount factors in the denominators of the terms in the value summation.

Even under certainty, the simple r-percent rule was seen not to hold
for nonconstant returns to scale in extraction. Uncertainty per se may
cause additional departures from the simple rule under those
conditions. Pindyck (1980) has shown, for example, that for nonlinear
production-cost functions uncertainty about aggregate future reserves
implies a path for net prices that rises more steeply than r. If so,
valuations computed solely from current spot prices will tend to
understate true market values.

Uncertainty can also affect valuation through the optionlike
elements, noted in Brennan and Schwartz (1983), that arise from
stochastic fluctuations in output prices. The owners of the property have
the right to close the property down if current spot prices fail to cover
current extraction costs; and more important, they can exercise the
option of starting it up again when and if profit prospects
unexpectedly become more favorable. Fortunately, however, as the numerical
simulations of Brennan and Schwartz suggest, these options bulk
large only for properties not currently producing. For companies
already in production, such as those in our sample, the option to close
is essentially a deep out-of-the-money put and the departure from the
strict Hotelling Valuation Principle is small enough to be safely
neglected. 9

5. The Effect of Income Taxes 

Up to this point we have entirely ignored the effect of corporate
income taxes on the production and valuation of exhaustible
resources. We are at least in good company in that respect; in the vast
theoretical literature on the Hotelling Principle surveyed by
Devarajan and Fisher (1981) there are rarely even passing references to
income taxes, let alone detailed analyses of how such taxes affect the
r-percent rule. The public finance literature, naturally enough, has
shown greater concern with the possible nonneutralities of the U.S.
tax treatment of mineral properties, but even here, interest seems to
have waned after the ending of percentage depletion for corporations
in the early 1970s. (For a recent brief survey of the public finance
literature see Gaudet [1977] and the references there cited.)

Although neglect of this complication is certainly understandable
when the focus is on basic theoretical issues, the requirements change
when the concern is, as here, with empirical testing of actual against
theoretical valuations. Since the real-world-observed valuations can
be presumed to take income taxes into account, some assumption
must be made as to how such taxes enter the predicted values.

9 As a precautionary check, we computed Hotelling values under the same
assumptions as used by Brennan and Schwartz (1983) in the numerical approximations to their
valuation formula in their table 1. For firms in production the values under the two
approaches were virtually identical.

It is tempting to follow the practice common in other settings of
merely reinterpreting all the variables as after-tax magnitudes. Thus
the Hotelling Pricing Principle would read “net prices, after taxes,
grow at the after-tax interest rate”; and the Hotelling Valuation
Principle would imply a value for $\beta$ in our estimating equation (6), not of
unity, but of unity minus the tax rate.

Such a solution, however, would give only a lower bound for the
value of reserves because it neglects the many offsets to income taxes
available to mineral producers under the U.S. Internal Revenue
Code. One major offset is the depletion allowance. For our sample
corporations during our sample period, the depletion deduction
allowed per unit of production was so-called cost depletion defined as
“the adjusted basis of the property” (essentially its original cost less
accumulated previous depletion deductions) divided by estimated
total reserves remaining. The present value of the future tax shields
from such cost depletion will add a positive term to the intercept in
the valuation equation (6).
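
The mechanics of cost depletion as just described can be sketched as follows. This is an illustration only, not a statement of the tax code: the function names, the basis of 1,000, the reserves of 100 units, the production schedule, and the 40 percent tax rate and 10 percent discount rate in the usage lines are all assumptions, and the reserve estimate is taken to be unrevised over the life of the property.

```python
def cost_depletion(adjusted_basis, remaining_reserves, production):
    """Yearly cost-depletion deductions.

    Each year the deduction per unit is the adjusted basis divided by the
    reserves remaining at the start of the year; the basis is then reduced
    by the deduction taken.
    """
    deductions = []
    for q in production:
        per_unit = adjusted_basis / remaining_reserves
        deduction = per_unit * q
        deductions.append(deduction)
        adjusted_basis -= deduction
        remaining_reserves -= q
    return deductions


def pv_tax_shield(deductions, tax_rate, r):
    """Present value of the tax saved by the deductions."""
    return sum(tax_rate * d / (1.0 + r) ** t for t, d in enumerate(deductions))


# Hypothetical property: basis 1,000, reserves 100 units, produced over 4 years.
deductions = cost_depletion(1000.0, 100.0, [40.0, 30.0, 20.0, 10.0])
shield = pv_tax_shield(deductions, tax_rate=0.40, r=0.10)
print(deductions, shield)
```

With an unrevised reserve estimate the deduction per unit stays constant at basis divided by initial reserves, and the deductions sum to the original basis once the reserves are exhausted; the discounted tax saving is the positive term referred to in the text.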

But this is only the beginning of the story. Suppose the firm finds
hitherto unsuspected (or at least unreported) reserves on its property;
or suppose its output prices unexpectedly surge upward. Then its
projected corporate income tax payments will also surge. The law,
however, offers a simple remedy: sell out! The property is worth
more to a buyer than to the seller because the buyer can step up the
basis. The buyer’s basis for depletion will be not the low original cost
basis of the seller, but the higher price actually paid for the property.
Systematic reshuffling of properties of this kind can have substantial
impact on the effective corporate income tax rate and hence on the
implied slope coefficient. For example, with a 10 percent discount
rate, production declining exponentially at 20 percent per year over
an assumed 20-year economic life, and resale with stepped-up basis at
the Hotelling value every 5 years, a statutory corporate income tax
rate of 40 percent falls to an effective rate of only a bit more than 10
percent.

This incentive to keep selling out old properties is dampened by
transactions costs and, in principle, also by the tax on the seller’s gain.
But at most, that tax is at the lower corporate capital gains rate; and
during our sample period, even that lower rate tax could have been
avoided, without triggering other recapture provisions, by such
devices as tax-free partial liquidation of the property or by spinning off
the low basis property to shareholders as a royalty trust. 10


10 For an account of the potential tax savings in properly handled sales and
liquidations of mineral properties see the account in Brown (1982) of the tax angles in the U.S.
Steel-Marathon merger. During our sample period distribution of corporate reserves
in a royalty trust occasioned no taxable gain or loss to the corporation, but the fair
market value of the trust interests received was treated as a dividend by the
shareholders. If larger than the firm’s accumulated retained earnings, the dividend was a
nontaxable return of capital under the personal income tax. When not a return of capital,
taxable individual shareholders could sell their shares cum the trust dividend (i.e., after
the dividend is declared, but before the stock goes ex dividend), thereby converting the
dividend to a capital gain. The cum dividend shares were attractive to taxable domestic
corporations, which could exclude 85 percent of the dividends received and then sell
the shares ex dividend, claiming the cum-ex differential as a capital loss. The net
negative tax on a corporation could run over 20 percent, though some of the benefits
were presumably shared with the original selling shareholders. Many of these loopholes
have since been closed, particularly tightly, by the Tax Reform Act of 1984.

As an alternative to selling off low-basis properties, many firms in
our sample chose to offset the taxable income against the expenses of
drilling new wells. The permitted 1-year write-off of so-called
intangible drilling costs (currently typically from two-thirds to
three-fourths of the cost of digging the well) not only reduces the effective
rate of corporate tax on the successful properties but creates strong
incentives to keep plowing earnings back into the firm, thereby also
reducing the tax bite on the owners under the personal income tax.

On balance, then, we conclude that taxes push the implied slope
coefficient in the Hotelling valuation equation below unity, but
probably not substantially so given the tax benefits, intended and
unintended, to oil and gas producers during our sample period.

Not to be able to pin down the slope coefficient more precisely than
“one or somewhat less” is disappointing. But the model has other
testable restrictions (e.g., that the coefficient on the extraction cost
variable be the negative of that on output prices), and other
predictors of value exist against which to judge the Hotelling valuations.


III. The Empirical Tests of the Hotelling
Valuation Principle

Previous tests of the Hotelling Principle have for the most part been
time-series tests of the price-path predictions. Our tests, by contrast,
are pooled, cross-sectional tests of the relation between observed
market values for oil- and gas-producing properties and the values
implied by the corollary to the Hotelling Principle that we have dubbed
the Hotelling Valuation Principle.


A. Sources of the Data

For many natural resources it is difficult if not impossible to obtain
reliable estimates either of recoverable reserves, R, or of the current
price net of extraction costs, $p_0 - c_0$. In the case of oil and gas
properties, however, publicly held companies, since 1978, have been
required by the SEC, pursuant to Regulation S-X, rule 4-10, to publish
annually estimates of “proved reserves,” economically recoverable at
current prices, and of actual net prices received for production
during the month prior to the date of the report. Additional details on
prices, costs, production, and reserves have become available as a
by-product of federal energy legislation during the middle and late
1970s.

The specific estimates of reserves, sale prices, and operating costs
we use are those presented in the periodical Oil Industry Comparative
Appraisals, published by John S. Herold, Inc. The Herold
compilations are readily available and widely used references based on public
sources such as the SEC-mandated company statements and reports
to state and federal regulatory agencies. The Herold figures for
operating costs also allow for the estimated excise taxes payable after
March 1, 1980, under the Crude Oil Windfall Profit Tax Act of 1980.

Herold lists some 60 firms as U.S. oil- and gas-producing
companies (as opposed to producing and/or refining companies), but we
have had to pare that number back to 39, eliminating mainly firms
whose oil and gas reserves were too small in absolute size or relative to
their nonpetroleum activities to get a reliable fix on their values. 11 For
each of the remaining companies, stock price data were obtained
from the Bank and Quotation Record for the dates corresponding to the
dates of the Herold estimates of reserves, usually, but not always, the
date of the closing of the company’s fiscal year. For 19 of the
companies, there were three reserve evaluation dates during our sample
period from December 1979 to August 1981, and for 17, there were
two, making 94 sample observations in all. 12


11 Two of the firms dropped appear to have been mainly coal companies; three had
significant undisclosed reserves of uranium; one was essentially a
contract driller; and one appears to have been a subsidiary of a company already in the
sample. The remainder were firms with less than two million barrels of oil.

12 The names of the sample firms are available from the authors.

B. Definitions of the Variables

Recall that the variables in our basic test equation (6) are V, the total
market value of the property, R, the total recoverable reserves, and
$(p_0 - c_0)$, the current net price per unit of reserves.

All the companies in our sample own both oil and gas reserves.
Since the Hotelling Valuation Principle presumably applies with
equal force to each, we begin by pooling the two types into a single
composite aggregate reserve of R barrels of oil or oil equivalents. The
conversion factor is the conventional one in the industry
(corresponding to approximate Btu equivalence) of 5,700 cubic feet of gas
for each barrel of oil. The net price $(p_0 - c_0)$ then becomes the
weighted average of the values for oil and gas defined as

$$p_0 - c_0 = \frac{R_{oil}}{R}(p_{oil} - c_{oil}) + \frac{R_{gas}}{R}(p_{gas} - c_{gas}), \tag{14}$$

where R = aggregate current reserves in barrels, $R_{oil}$ = oil reserves in
barrels, $R_{gas}$ = gas reserves in oil-equivalent barrels, and $(p_{oil} - c_{oil})$
and $(p_{gas} - c_{gas})$ are the net prices for each barrel of oil and
oil-equivalent gas.
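
The pooling of oil and gas into a composite reserve can be sketched as below. Only the conversion factor of 5,700 cubic feet per barrel comes from the text; the function name and the reserve and net price figures in the usage lines are hypothetical.

```python
CUBIC_FEET_PER_BARREL = 5700.0  # conversion factor stated in the text


def composite_net_price(oil_barrels, gas_cubic_feet, net_price_oil, net_price_gas_boe):
    """Reserve-weighted net price per oil-equivalent barrel.

    Gas reserves are first converted to oil-equivalent barrels; the net
    price of gas must already be expressed per oil-equivalent barrel.
    """
    gas_boe = gas_cubic_feet / CUBIC_FEET_PER_BARREL
    total = oil_barrels + gas_boe
    net_price = (oil_barrels / total) * net_price_oil + (gas_boe / total) * net_price_gas_boe
    return total, net_price


# Hypothetical firm: 10 million barrels of oil and 57 billion cubic feet of gas.
total_reserves, hotel = composite_net_price(10e6, 57e9, 20.0, 8.0)
print(total_reserves, hotel)
```

In this made-up case the gas converts to 10 million oil-equivalent barrels, so oil and gas receive equal weights in the composite net price.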

For V, the value of the reserves, the ideal measure would be actual
transaction prices for fields. Not enough publicly recorded cases are
yet available, however, to permit formal statistical testing. Absent
transaction prices for the petroleum reserves on the asset side, we
shall instead follow the standard practice in finance of substituting the
values of the claims on the liability side, a practice that for our
sample industry has come to be called prospecting for oil on the floor of
the New York Stock Exchange.

Were it a matter only of measuring the value of the equity claims,
the calculations would be straightforward (price per share times
number of shares outstanding). But virtually all companies in our sample
had creditor claims outstanding, and valuing these is not so simple.
Typically, debts were of three types: short-term obligations (mostly
trade accounts payable); bank loans (normally at floating rates); and
long-term, fixed-rate bonds and notes. We took the first two at book
value. For long-term debt, we used actual market values where
publicly traded; and where not, we used estimates based on prices for
publicly traded issues of comparable rating and maturity.

Further complications arise because most companies own assets 
other than the oil and gas reserves of our primary concern and these 
other assets must be netted out. For some nonpetroleum assets such 
as investments in the securities of other firms, market prices could be 
obtained. The others, perforce, were taken at book value. That 
approximation is unlikely to be far off for assets such as cash or accounts 
receivable, but is more problematic for assets like plant, equipment, 
tank trucks, inventories, and in a few cases “intangibles.” 
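
The construction just described (equity plus debt, less nonreserve assets, per barrel of reserves) can be summarized in a short Python sketch. This is an illustration under assumed inputs, not the authors' actual computation; the function and argument names are hypothetical.

```python
def reserve_value_per_barrel(share_price, shares, debt_value,
                             nonreserve_assets, reserves_boe):
    """Market value of reserves per oil-equivalent barrel (the V/R of the text).

    debt_value stands for the sum of all creditor claims (book or market, as
    described above); nonreserve_assets is the estimated value of other assets.
    """
    equity = share_price * shares
    total_value = equity + debt_value - nonreserve_assets  # V
    return total_value / reserves_boe                       # V / R


# Hypothetical firm: (25 * 4 + 30 - 50) / 10 dollars per barrel.
value = reserve_value_per_barrel(25.0, 4.0, 30.0, 50.0, 10.0)
print(value)
```

Nothing in this definition prevents a negative result when nonreserve assets are large relative to the claims, which is why the measure cannot be used in logarithms.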

C. The Basic Regression 

Line 1 of table 1 shows the results of fitting the regression equation (6) 
to our sample of producing companies. The average net current price 
(denoted as HOTEL) has a slope coefficient of 0.910 with a standard 
error of 0.114 and is thus in close agreement with the predicted “one, 
or a bit less” of the tax-adjusted Hotelling Valuation Principle. The 
negative intercept of -2.240 with a standard error of 1.035 is consistent 
with the predictions derived in the previous section for the Hotelling 
Valuation Principle under conditions of decreasing returns to 
scale in current and cumulative production. 13 

Not only are the regression coefficients in accord with prior expectations, 
but the net price variable HOTEL accounts for a substantial 
proportion of the variation ($R^2$ = .408) of $V/R$ (denoted hereafter as 
VALUE). A glance at the scatter plot in figure 1 shows the relation to 
be a pervasive one and not attributable merely to a few outliers. Nor is 
either the strength of the relation or the near-unity value of the slope 
coefficient merely an artifact of the correlation of errors in the total 
reserves, $R$, that appears in the denominators of both VALUE and 
HOTEL. Row 2 of table 1 shows that adding $1/R$ as a separate variable 
leaves both the intercept and the slope coefficient of HOTEL virtually 
unchanged. The term $1/R$ itself appears to have little marginal 
explanatory power in the presence of the key variable HOTEL. 14 
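
For readers who want to see the mechanics of a regression of this form, the following Python sketch fits VALUE = a + b HOTEL by ordinary least squares. The data are synthetic: the sample size of 94 matches the note to table 2, but the intercept, slope, and noise level used to generate the data are assumptions chosen only to resemble the reported magnitudes, and the output does not reproduce table 1.

```python
import random
from math import sqrt


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, se_b, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    tss = sum((yi - my) ** 2 for yi in y)                      # total sum of squares
    se_b = sqrt(rss / (n - 2) / sxx)                           # standard error of the slope
    return a, b, se_b, 1.0 - rss / tss


# Synthetic sample (assumed parameters, for illustration only).
random.seed(0)
hotel = [random.uniform(4.0, 16.0) for _ in range(94)]
value = [-2.0 + 0.9 * h + random.gauss(0.0, 3.0) for h in hotel]

a, b, se_b, r2 = simple_ols(hotel, value)
print(f"intercept={a:.3f} slope={b:.3f} (se {se_b:.3f}) R2={r2:.3f}")
```

The Hotelling prediction concerns the slope: a value of one, or a bit less, with a standard error small enough to distinguish it from zero.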

D. The Basic Regression Further Disaggregated 

Additional tests of the restrictions implied by the Hotelling Valuation 
Principle, as well as checks on the validity of the pooling assumptions 
underlying the basic regression, are presented in rows 3-9 of table 1. 
Rows 3, 6, and 7 show the effect of separating HOTEL into its two 
components, OIL and GAS. Note from rows 6 and 7 the dramatic 
drop in explanatory power when the components are entered singly. 
Clearly market values reflect, as they should, both kinds of reserves. 
As to the relative weights on each, the Hotelling Valuation Principle 
presumably applies equally to reserves of both types, and hence the 
coefficients should be the same. As it turns out, the coefficient of GAS 
is somewhat higher than that of OIL but not by more than could be 
accounted for by normal sampling fluctuations. A formal $F$-test 
confirms that the restriction of equal coefficients on OIL and GAS 
cannot be rejected at conventional levels of significance. 

Disaggregation of HOTEL into its output price and extraction cost 
components provides another, and in some respects even more strin- 


13 Since our $R$ measures "proved" reserves rather than "proved and developed" 
reserves, the negative intercept may also reflect the market's anticipation of future 
development costs. 

14 We also reran the basic regression in ratio form (i.e., regressing VALUE/HOTEL 
on 1/HOTEL and a constant) partly as a further check on any problems caused by a 
common $R$ and partly as a simple correction for the slight heteroscedasticity in the basic 
regression detected in a Park-Glejser test. The standard errors dropped somewhat, as 
would be expected, but the coefficients themselves were virtually unchanged. The 
results were also essentially the same when the regression was run in terms of total 
value rather than on a per barrel basis. 
Fig. 1. Plot of VALUE versus HOTEL. Variables measured in dollars per barrel of 


gent, test of the Hotelling Valuation Principle. The theory says that 
both terms are required to explain value and that they enter the 
regression with equal magnitude but opposite sign. Note then once 
again the substantial drop in explanatory power when each component 
is entered singly, for example, in regressions (9) and (10) as 
compared to regressions (1) and (4) where both are present. (These 
results should also serve as a useful reminder of the dangers of focusing 
solely on product prices to the neglect of extraction costs, a failing, 
we suspect, that is at least partly responsible for the popular 
journalistic belief that the stock market value of oil is always cheap 
relative to market prices and exploration costs. 15 Extraction costs 

15 Note also the negative intercept in the valuation equation. Even after allowing for 
current production costs, the current price gives only an upper bound to the value per 
barrel because user costs (and, in our sample, future development costs) must be taken 

TABLE 2 

Tests of HOTEL versus Time 

Regression    Constant     HOTEL      DATEST     $R^2$     $\sigma_e$ 
(1)             2.886       ...        .0091      .289      3.495 
               (.572)                 (.0015) 
(2)            -1.645       .750       .0026      .420      3.174 
              (1.122)      (.165)     (.0026) 

Note: Dependent variable: market value of oil and gas per barrel equivalent of oil and gas reserves. Independent 
variables defined in text. N = 94. Standard errors in parentheses. 


must be included along with prices to account for differences in the 
market value of reserves.) And, more to our main concern, the results 
in regressions (4) and (5) are consistent, up to sampling error, with 
the prediction that the coefficients of prices and costs are equal but of 
opposite sign. 

E. A Check for Common Time Trends 

The seeming explanatory power of our HOTEL variable may perhaps 
be reflecting only the common upward trend in both stock prices 
and petroleum prices during our sample period of December 1979- 
August 1981. We have already seen some indirect evidence that such 
is not the case: regression (9) of table 1 has lower explanatory power 
than regression (1). But a more direct test is provided in table 2, which 
adds to the basic regression an additional variable, DATEST, representing 
the date for which the reserves and market values were estimated 
(December 31, 1978 = 1, etc.). As can be seen from the first 
regression in table 2, market value is indeed positively correlated with 
DATEST. But the explanatory power of the time-trend variable 
DATEST is substantially less than that of HOTEL alone; and both 
the regression coefficient of HOTEL and the correlation coefficient 
of the regression remain essentially the same whether or not the 
time-trend variable is included. 16 

F. Correcting for Serial Dependence of Individual Company Residuals 

In a sample such as ours, with multiple observation dates for each 
company, the standard regression assumption of independent residuals 
cannot be taken for granted. Indeed, if capital markets are 
efficient (and we have, of course, been relying on that efficiency 
implicitly in our use of security prices as the relevant measure of the 
value of reserves), the individual company residuals from our basic 
regression cannot be serially independent. To see why, suppose that 
$e_{i,t+1}$, the residual difference between company $i$'s market value at 
time $t + 1$ and its regression prediction $a + \beta\,\text{HOTEL}_{i,t+1}$, were 
independent of the previous residual $e_{i,t}$. Then for any firm with a 
positive residual at $t$, the expected change in residual is negative, 
implying an expected fall in value relative to the regression line; and for 
a firm with a negative residual, the expected change in residual and in 
value relative to the line is positive. Thus positive expected profits 
could be earned on a zero-investment portfolio formed at $t$, long in 
the shares of the “undervalued” stocks with negative $e_{i,t}$ and short in 
the “overvalued” stocks with positive $e_{i,t}$. 

ent as well. Recall also the earlier discussion in Sec. IIC5 above of the tax benefits from 
sales of reserves and from drilling. 

16 As an additional check on the validity of the pooling of observations at different 
times we reran the basic regression separately for each half of the sample period. We 
could not reject the restriction of equal slopes and intercepts in the two subperiods. 

To guard against any mistaken inferences traceable to this autocorrelation 
of individual firm residuals, we have reestimated the basic 
equation in first-difference form with the following result: 

$$\text{VALUE}_{i,t+1} - \text{VALUE}_{i,t} = \underset{(0.794)}{0.185} + \underset{(0.226)}{0.978}\,(\text{HOTEL}_{i,t+1} - \text{HOTEL}_{i,t}), \tag{15}$$

$R^2 = .262$, $\sigma_e = 8.747$. 

The relation between HOTEL and VALUE found earlier thus appears 
to persist even in the face of differencing. 17 


G. Other Tests of the Specification 

Two other tests of the robustness of the specification are perhaps also 
worth noting. As a test of the predicted linearity of the basic relation, 
we added a variable $(\text{HOTEL})^2$ to the equation and found that it 
made no material contribution. As a further check on distortions 
introduced by errors or omissions in our estimates of creditor claims 
or of other, nonreserve assets, we also reran the basic equation with 
stock market value rather than total net value as the dependent variable 
and with creditor claims and nonreserve assets on the right-hand 
side as separate independent variables. These variables came in with 
correct signs and reasonable magnitudes, suggesting at least that they 
are not hopeless proxies for what they seek to measure. And, more to 
the immediate point, the coefficient of HOTEL was little affected. 


17 The first differences are taken as absolute differences rather than percentage or 
logarithmic differences because of negative values in a few cases for our variable 
VALUE (defined, it will be recalled, as the sum of equity plus debt minus the estimated 
value of other, nonreserve assets). 
H. Summary 

For our sample of domestic petroleum-producing firms, the relation 
between the market value of reserves and current output prices net of 
extraction costs appears to conform closely to that predicted by the 
Hotelling Valuation Principle. This closeness stands up, moreover, in 
the face of a variety of standard and special diagnostic checks. Before 
awarding any palms to the Hotelling Valuation Principle, however, it 
is important to check out the alternatives. The net current price, after 
all, is likely to play a substantial role in any “discounted cash flow” 
valuation formula, though only the Hotelling Valuation Principle 
identifies it as the critical determinant of value. A more stringent test 
of the principle is thus whether it can predict more accurately than 
reasonable alternatives, and it is to these comparisons we now turn. 


IV. Comparison with Alternative Valuations 

As noted earlier, we are fortunate in having two publicly available 
valuations of the reserves for the companies in our sample. One is the 
annual statement of Estimated Present Value of Estimated Future 
Net Revenues from Oil and Gas Activities mandated since 1978 by the 
SEC pursuant to Regulation S-X, rule 4-10, for inclusion in the 
financial reports of registered oil and gas producers. The other is 
the regularly published appraisals of John S. Herold, Inc., from 
whose raw data we have constructed our own Hotelling measures. 

A. The SEC Valuations 

Under SEC regulations, estimated future net revenues are computed 
by applying, for conservatism, current oil and gas prices net of extrac¬ 
tion costs to estimated future production from net proved reserves, 
less estimated future development expenditures. The present value 
of the estimated net revenue stream is then determined by discount¬ 
ing at a prescribed rate of—what else?—10 percent. 
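
A stylized version of this SEC-type calculation is sketched below in Python. It is an illustration only: the function name, the end-of-year discounting convention, and the two-year production schedule are assumptions, while the constant current net price and the 10 percent rate follow the description above.

```python
def sec_style_value(net_price, production, development_costs, rate=0.10):
    """Present value of future net revenues at a fixed discount rate.

    The current net price is held constant over the whole schedule, and each
    year's cash flow is assumed to arrive at the end of that year.
    """
    return sum(
        (net_price * q - d) / (1.0 + rate) ** t
        for t, (q, d) in enumerate(zip(production, development_costs), start=1)
    )


# Hypothetical schedule: one barrel a year for two years, no development outlays.
pv = sec_style_value(10.0, [1.0, 1.0], [0.0, 0.0])
print(round(pv, 4))
```

Because the price is frozen at its current level while cash flows are discounted, the resulting value per barrel falls below the current net price whenever production is spread over time.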

In using these SEC-mandated values as standards for comparison it 
is not our intention to suggest that they are or ever were used by 
anyone to appraise reserves. The SEC valuations rather should be 
thought of as analogues to the “naive models” against which more 
elaborate forecasts are calibrated. 


B. The Herold Appraisals 

The Herold appraisals, by contrast, are intended to be taken seriously 
and are, in fact, widely cited in the financial press. The appraisals 
themselves are representative of the “discounted-cash-flow” calculations 
that have become standard in the literature of finance. Herold 
has, it is true, finessed the problems connected with estimating an 
appropriately risk-adjusted “cost of capital,” opting instead, like the 
SEC, for the conventional 10 percent. But unlike the SEC rules, 
Herold’s estimates of future cash flows do try to allow for what the 
firm regards as the likely future course of petroleum prices and costs. 
Herold (November 1980), for example, assumes “that the current 
average price received by each individual company will increase to a 
decontrolled world oil price of $35.00 per barrel by 1982, at which 
time the price will increase at an annual rate of 6% until the year 2000 
and remain constant thereafter.” Over the same horizon, “unit operating 
costs are escalated at a rate for each individual company which is 
dependent on the number of years to exhaust the reserves and the 
unit price of oil or gas at the time of exhaustion” (p. 706). 
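
The quoted price assumption can be written out as a simple schedule. The Python sketch below is hypothetical in several respects: the quotation does not say how the price moves between the current year and 1982, so linear interpolation is assumed, and the starting year and starting price are made-up defaults. Only the $35.00 level in 1982, the 6 percent growth to 2000, and the flat path thereafter come from the quotation.

```python
def herold_price(year, start_year=1980, start_price=25.0):
    """Assumed oil price path in dollars per barrel (start_year must be before 1982)."""
    if year <= start_year:
        return start_price
    if year <= 1982:
        # Assumption: straight-line rise from the current price to $35.00 in 1982.
        frac = (year - start_year) / (1982 - start_year)
        return start_price + frac * (35.0 - start_price)
    # 6 percent annual growth from 1982 to 2000, constant thereafter.
    return 35.0 * 1.06 ** (min(year, 2000) - 1982)


for y in (1980, 1982, 1983, 2000, 2010):
    print(y, round(herold_price(y), 2))
```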


C. Comparison of the Alternative Valuations 

Table 3 shows the correlation coefficients between the three alternative 
valuation measures and between each and VALUE, our measure 
of the market value of reserves. Note that the three alternative valuations 
are indeed highly correlated, as would be expected from the 
important information about current prices and costs they share in 
common. Note also that even the naive SEC measure appears to account 
for a nontrivial fraction of the variation in VALUE, though 
substantially less so than its better-grounded rivals, HOTEL and 
HEROLD. Of the three, the strongest association with VALUE is 
found for HOTEL. 

Not only is HOTEL the best single predictor of market value in our 
sample, but neither of the alternatives, despite their seemingly high 
correlations with VALUE, appears to have significant information 
about VALUE not already subsumed in HOTEL. Table 4, for example, 
presents the results of Davidson-MacKinnon (1981) tests of 

TABLE 3 

Correlation Coefficients between Alternative Valuation Measures* 

            VALUE     HOTEL     SEC       HEROLD 
VALUE       1.0       .641      .547      .628 
HOTEL                 1.0       .751      .897 
SEC                             1.0       .701 
HEROLD                                    1.0 

* Based on the 92 observations for which SEC valuations were available. 
TABLE 4 

Tests of Alternative Models of Market Values for Reserves* 

Maintained    Alternative    $\hat\alpha$     $t(\hat\alpha)$ 
HOTEL         SEC            .265      1.192 
HOTEL         HEROLD         .409      1.420 
SEC           HOTEL          .831      4.395 
SEC           HEROLD         .765      4.239 
HEROLD        HOTEL          .642      2.284 
HEROLD        SEC            .384      1.854 

* Based on the 92 observations for which SEC valuations were available. 


regression specification in the presence of alternative models pur¬ 
porting to explain the same phenomenon. 

For testing a linear model $H_0$: $y_i = f_i(X_i, \beta_1) + \epsilon_{0i}$ against an alternative 
nonnested model $H_1$: $y_i = g_i(Z_i, \beta_2) + \epsilon_{1i}$ (where $\beta_1$ and $\beta_2$ are 
vectors of parameters to be estimated), Davidson and MacKinnon 
propose the regression 

$$y_i = (1 - \alpha) f_i(X_i, \beta_1) + \alpha \hat{g}_i + \epsilon_i, \tag{16}$$

where $\hat{g}_i = g_i(Z_i, \hat\beta_2)$ and $\hat\beta_2$ is the maximum likelihood estimate of $\beta_2$. 
If $H_0$ is true, then the true value of $\alpha$ is zero and the $t$-statistic provides 
a test of that hypothesis. Generalizations of (16) are possible if $f$ is 
nonlinear, but they need not concern us here. 

To implement the Davidson-MacKinnon test we first fit a relation 
between VALUE and SEC of the form 

$$\text{VALUE}_i = \gamma_0 + \gamma_1 \text{SEC}_i + e_i \tag{17}$$

and then use the fitted values to compute the regression estimate $\hat{g}_i$. 
We then estimate an equation of the form 

$$\text{VALUE}_i = (1 - \alpha)(a + b\,\text{HOTEL}_i) + \alpha \hat{g}_i + \eta_i \tag{18}$$

and test for $\alpha = 0$. Similar steps are followed for testing HOTEL 
against HEROLD. For completeness, we also test SEC and HEROLD 
as the maintained hypothesis against each of the alternatives. 

Columns 3 and 4 of table 4 show the values of $\hat\alpha$ and $t(\hat\alpha)$ for various 
combinations of maintained and alternative hypotheses. In the first 
two rows, the Hotelling model prediction is the standard and SEC and 
HEROLD are the alternatives. The low $t$-values for $\hat\alpha$ imply that $\alpha = 0$ 
cannot be rejected at conventional significance levels, which is to say 
that SEC and HEROLD can be rejected as alternatives to HOTEL. On 
the other hand, when the roles are reversed, as in the subsequent 
rows 3 and 5, we cannot reject HOTEL as an alternative to either SEC 
or HEROLD. 
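
The two-step procedure of equations (17) and (18) is easy to mimic on artificial data. The Python sketch below is a minimal illustration, not the authors' code: the data are synthetic (generated so that the HOTEL-type variable is the true model), all names and parameter values are hypothetical, and the small least-squares routine is written out by hand only to keep the example self-contained. Because (18) is linear in HOTEL, $\alpha$ is simply the coefficient on the fitted values $\hat{g}_i$ in a regression of VALUE on a constant, HOTEL, and $\hat{g}_i$.

```python
import random
from math import sqrt


def invert(m):
    """Gauss-Jordan inverse of a small square matrix given as a list of lists."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def ols(y, cols):
    """OLS of y on a constant plus the given columns: (coefs, t-stats, fitted)."""
    X = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    b = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in X]
    s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (len(y) - k)
    t = [bi / sqrt(s2 * inv[i][i]) for i, bi in enumerate(b)]
    return b, t, fitted


def j_test(y, maintained, alternative):
    """Return (alpha_hat, t_alpha) with `maintained` as H0 against `alternative`."""
    _, _, g_hat = ols(y, [alternative])    # step 1: fitted values, as in (17)
    b, t, _ = ols(y, [maintained, g_hat])  # step 2: augmented regression, as in (18)
    return b[-1], t[-1]


# Synthetic data in which the HOTEL-type variable is the true model (assumption).
random.seed(1)
hotel = [random.uniform(4.0, 16.0) for _ in range(92)]
sec = [0.8 * h + random.gauss(0.0, 2.0) for h in hotel]
value = [-2.0 + 0.9 * h + random.gauss(0.0, 3.0) for h in hotel]

print("H0 = HOTEL, alternative = SEC:", j_test(value, hotel, sec))
print("H0 = SEC, alternative = HOTEL:", j_test(value, sec, hotel))
```

A small $t$-statistic in the first line and a large one in the second would correspond to the pattern reported in rows 1 and 3 of table 4, although the numbers produced here depend entirely on the assumed data-generating process.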
The role of the Hotelling Principle or “r-percent rule” and its natural 
extensions as the central propositions of resource economics is unlikely 
to be much affected by “mere” empirical testing. No viable 
alternative paradigm exists. Still, economists should be comforted to 
learn, given the generally poor results to date of direct time-series 
tests, that at least some of the restrictions implied by the Hotelling 
model of optimal intertemporal production can indeed be detected in 
real-world data. In particular, we have seen that an interesting but 
little-cited corollary of the Hotelling Principle, the Hotelling Valuation 
Principle, provides not only reasonably good descriptions of the 
structure of actual market values of a sample of U.S. petroleum-producing 
companies, but substantially better descriptions than the 
publicly available alternative appraisals based on much the same 
underlying raw data. 

The relative robustness and predictive success of the Hotelling-based 
appraisals provide additional evidence, if that is needed, that 
more complicated models are not always better. The Hotelling Valuation 
Principle could hardly be simpler: it says, essentially, that the 
value of a unit of reserves in the ground is the same as its current 
value above the ground less the marginal costs of extracting it. Things 
work out this neatly because in a Hotelling world, the two key components 
of the valuation formula (the expected trend of future net 
product prices and the appropriate discount rate) are not independent, 
but are two sides, as it were, of essentially the same coin. Conventional 
valuation procedures that try to estimate the pieces separately 
will thus ignore the important restrictions imposed by the 
theory of optimal exploitation. 
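
The cancellation described in this paragraph can be checked numerically. The Python sketch below is an illustration under simplifying assumptions (a net price that rises exactly at the discount rate, and extraction plans chosen arbitrarily); the function name and the figures are hypothetical.

```python
def discounted_value(net_price_now, r, extraction_plan):
    """Present value of an extraction plan when the net price grows at rate r."""
    total = 0.0
    for t, quantity in enumerate(extraction_plan):
        net_price_t = net_price_now * (1.0 + r) ** t      # r-percent rule for net price
        total += net_price_t * quantity / (1.0 + r) ** t  # discounted back to today
    return total


# Two hypothetical plans, each exhausting 100 barrels of reserves.
fast_plan = [50.0, 30.0, 20.0]
slow_plan = [10.0] * 10

v_fast = discounted_value(12.0, 0.10, fast_plan)
v_slow = discounted_value(12.0, 0.10, slow_plan)
print(v_fast, v_slow)
```

Both plans are worth the current net price times total reserves, up to rounding, whatever the timing of production. A discounted-cash-flow appraisal that pairs a 10 percent discount rate with an independently chosen price forecast would not, in general, have this property.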
This paper develops a model of a country that can import oil in 
normal times but is subject to random trade disruptions. To mitigate 
the effects of these disruptions, some of the imported oil is stockpiled. 
There are two kinds of uncertainty involved: the date at 
which an embargo will be imposed is not known in advance, and 
given that an embargo has been imposed, the date at which it will 
end is uncertain. Optimal behavior is analyzed by means of dynamic 
programming. 


I. Introduction 

We consider a country that is able to import in normal times a steady 
flow of a commodity at a given world market price. For brevity we will 
call this commodity “oil.” Now, the imports of oil are subject to possible 
curtailments of deliveries from foreign countries. To hedge 
against the costs of such unexpected interruptions, some of the imported 
oil is stockpiled. In the case of an embargo, the country can 
run down its stocks of emergency reserves and thereby mitigate the 
impact of the supply disruption. From the point of view of the importing 
country, there are two kinds of uncertainty making the problem 
of optimal management of emergency reserves a nontrivial one. 
First, the date at which an embargo will be imposed is not known in 
advance but has to be treated as a random variable. Second, given that 
an embargo has been imposed, the date at which it will end is uncertain. 

Parts of the problem of trade disruptions have been analyzed by, 
for example, Nordhaus (1974); Kuenne, Blankenship, and McCoy 
(1979); and, in a different context, Arad and Hillman (1979). These 
papers are characterized by a 2-period setting; in the first period the 
country can import freely, while there might be a trade disruption 
(with probability $\pi$) in the second period. A more realistic, multiperiod 
setup is that of Tolley and Wilman (1977), where the duration 
of the embargo is uncertain. Our problem is somewhat similar to 
that of Tolley and Wilman, and we will discuss below in what respect 
our analysis differs from theirs. 

Formally, our problem resembles the one in the literature dealing 
with the optimal depletion of a natural resource in the face of the 
possible future introduction of a new technology. This problem has 
been studied by, for example, Dasgupta and Heal (1974) and Dasgupta 
and Stiglitz (1981), and we will show below that some of our 
results bear a close resemblance to theirs. The two sources of uncertainty 
in the embargo problem, however, make it somewhat different. 

The paper is organized as follows. In Section II below, we set up the 
formal structure of the model. The environment of the oil-importing 
country can be characterized by either of two regimes: “free trade” 
and “embargo,” respectively. Optimal behavior can be analyzed by 
means of dynamic programming arguments. Under each of the two 
regimes the optimal policy must satisfy a particular functional equation, 
and the overall problem involves the solution of the simultaneous 
system of these two equations. It is convenient to start by analyzing 
one equation at a time: in Section III we analyze the optimal 
depletion of an oil reserve during an embargo, when the size of the 
reserve is given and when the duration of the embargo is uncertain. 
In Section IV we analyze the optimal stockpiling policy during free 
trade, when the duration of the free-trade regime is uncertain. In 
general, this involves the simultaneous solution of the two functional 
equations. Under the simplifying assumption of a perfect international 
capital market this problem can be easily solved and an explicit 
expression for the optimal size of the emergency reserves can be 
obtained. In Section V we derive expressions for the welfare of the 
country that is subject to embargo threats, and in Section VI, finally, 
we conclude the paper by discussing some possible extensions of the 
model. 


II. The Model 

Assume that a country consumes two commodities: one imported, 
which we will call “oil,” and one domestically produced, perishable 
commodity, which we will call “potatoes” for brevity. We confine our 
analysis to a small country; thus the relative price of oil in terms of 
potatoes, denoted by $p$, is assumed to be exogenously given by a world 
market. 1 We will make the simplest possible assumptions about the 
production technology and just assume that a constant flow of 
potatoes, $Z$, will be costlessly made available to the country at each 
instant of time. Denoting the country’s consumption of oil at time $t$ by 
$q_t$ and its consumption of potatoes by $z_t$, we can write the budget 
constraint as 

$$p\dot{S}_t + p q_t = Z - z_t + \xi_t,$$

where $S_t$ is the country’s stock of oil reserves, and where the dot 
indicates the time derivative. The variable $\xi_t$ is the deficit in the balance 
of trade. While $S_t$, $q_t$, and $z_t$ are nonnegative, $\xi_t$ is not restricted in 
sign. A positive value of $\xi_t$ indicates that the country is borrowing in 
the international capital market (or, alternatively, is a net creditor and 
receives interest income and amortizations). A negative value implies 
that the country pays interest and amortizes loans (or, alternatively, 
lends money to foreign borrowers). 

If no international capital market exists, we impose the restriction 
that 

$$\xi_t = 0 \quad \text{for all } t. \tag{1}$$

In this case the country cannot borrow money to build up its oil 
reserves but has to rely solely on its potato crop to finance the stockpile. 
If, on the other hand, the country has access to a perfect international 
capital market, $\xi_t$ has to satisfy the milder constraint 

= 0 ( 2 ) 

1 Strictly speaking, the assumption of a constant price is unacceptable in an analysis 
of the oil market. Instead, world oil reserves should be depleted according to some 
Hotelling formula. The assumption of $p_t = p$ is made to facilitate the exposition; those 
who do not like it can substitute the word “wheat,” e.g., for “oil” throughout the paper. 

instead. 2 In Sections II and III the analysis will be general enough to 
encompass either of the assumptions (1) and (2). In Section IV we will 
limit ourselves to the more straightforward case of a perfect credit 
market, and we will comment only briefly on the implications of the 
absence of a credit market. 

Assume further that the country’s utility is additively separable in 
time and that the marginal utility of potato consumption is constant. 
Instantaneous utility can then be written as 

$$U(q_t, z_t, t) = e^{-rt}\,[u(q_t) + z_t],$$

where $r$ is the discount rate and where we assume that $u'(q) > 0$, $u''(q) < 0$, 
and $\lim_{q \to 0} u'(q) = \infty$. The assumption of a constant marginal 
utility of the background commodity is made to facilitate the analysis 
and is a standard one in the literature on natural resources. 3 

In the long-run perspective, we will assume that periods of free 
trade come and go, interrupted by embargoes, in an infinite sequence. 4 
We assume the stochastic process $\{x_t\}$ generating the sequence 
of regimes to be a two-state stationary Markov process that 
takes the values zero and one. We identify the state $x_t = 0$ with an 
embargo and the state $x_t = 1$ with a free-trade regime. The assumption 
of a stationary Markov process implies that the duration $\tau_0$ of an 
embargo is exponentially distributed with parameter $\theta_0$. This means 
that the density function of the duration $\tau_0$ can be written as $f_0(\tau_0) =$ 

2 The expectation is taken since there is uncertainty involved in the problem, which 
means that we cannot automatically guarantee that all debts are repaid with certainty. 
Constraint (2) thus says that all debts are repaid on the average, which means that 
lenders are assumed to be risk neutral. The probability laws defining the expectations 
operator $E[\cdot]$ will be presented later in this section. As will also be shown later, it can 
sometimes be advantageous to adjust the stock of oil instantaneously. Thus the stock of 
oil would make a discrete jump $\Delta S_t$ at some $t = \tau_0$. This means that if the control 
variable $\xi_t$ were a function of time, it would be infinite for $t = \tau_0$. In such a case the 
integral in (2) is not well defined. We can, however, get around this problem by regarding 
$\xi_t$ not as a function of time but as a measure, and by regarding the integral in (2) as a 
Lebesgue integral. The intuitive meaning of this rather abstract mathematics will be 
clear from the discussion of the steady-state variable $\xi^*$ in the App. 

3 Our assumption of oil as a consumption good is not critical for the analysis. It is 
quite possible to assume instead that oil is used as an input in the production of a 
consumption good (as is the case with, e.g., fuel and/or fertilizer). Since this means that 
we would have to introduce a production technology into the model, we have decided 
to economize on concepts and notation by considering only the simplest case, i.e., that 
in which oil is used as a consumption good. 

4 In fact, the concept of an infinite sequence of shifting regimes is what makes our 
analysis different from the one of, e.g., Dasgupta and Heal (1974) and Dasgupta and 
Stiglitz (1981). In these papers it is assumed that the new technology is a once-and-forever 
event; as soon as it is introduced, it will last forever, and nothing new will 
happen. Hillman and Long (1983), in their discussion of how embargo threats affect 
the optimal depletion of a natural resource, also make this once-and-for-all assumption; 
the embargo occurs at (the stochastic) date $T$ and lasts forever thereafter. 
$\theta_0 e^{-\theta_0 \tau_0}$. Likewise, the duration $\tau_1$ of a free-trade regime is exponentially 
distributed with parameter $\theta_1$ according to $f_1(\tau_1) = \theta_1 e^{-\theta_1 \tau_1}$. 
The parameter $\theta_0$ is the probability rate that the country will be in a 
free-trade regime in the next instant, given that there is an embargo 
going on now. Likewise, if no embargo is now in effect, $\theta_1$ denotes the 
probability rate that the economy will be in an embargo regime in the 
next instant. Hence the stochastic process $\{x_t\}$ is completely characterized 
by the two transition probability rates $\theta_0$ and $\theta_1$. This follows 
because the Markovian assumption implies that the likelihood of leaving 
a given state in the next instant is independent of the history of 
the process, while the stationary assumption implies that the likelihood, 
given the history, is independent of the date $t$. 
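
To make the regime process concrete, the following Python sketch simulates alternating free-trade and embargo spells with exponentially distributed durations. It is an illustration only; the function name, the seed, and the parameter values (embargoes ending at rate 2.0, free trade ending at rate 0.2) are hypothetical.

```python
import random


def simulate_regimes(theta0, theta1, horizon, start_state=1, seed=0):
    """Return a list of (state, start_time, duration) spells covering `horizon`.

    State 0 is an embargo (ends at rate theta0); state 1 is free trade
    (ends at rate theta1). Durations are drawn from the exponential law.
    """
    rng = random.Random(seed)
    t, state, spells = 0.0, start_state, []
    while t < horizon:
        rate = theta0 if state == 0 else theta1
        duration = rng.expovariate(rate)  # mean duration is 1 / rate
        spells.append((state, t, duration))
        t += duration
        state = 1 - state                 # regimes strictly alternate
    return spells


spells = simulate_regimes(theta0=2.0, theta1=0.2, horizon=50.0)
for state, start, duration in spells[:5]:
    label = "free trade" if state == 1 else "embargo"
    print(f"{label:10s} starts {start:7.2f}, lasts {duration:6.2f}")
```

With these assumed rates the expected embargo lasts half a period and the expected free-trade spell five periods, so the country spends most of its time importing freely.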

We can now formulate the problem of the oil-importing country. 
Assume that an optimal policy exists. Then, if no embargo is now in 
effect (i.e., if $x_0 = 1$), let $V(1, S)$ denote maximum expected net utility 
to be derived from optimal behavior, with an initial stock $S$ in storage. 
Assume that physical storage per se is costless, though of course 
financial capital tied up in the storage forgoes the opportunity to earn 
interest. On the other hand, if an embargo is now in effect (i.e., if $x_0 = 0$), 
let $V(0, S)$ denote the expected present value of welfare when 
optimal behavior is pursued from the initial inventory of $S$ in storage. 
When there is an embargo going on, optimal consumption implies the 
following: 


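$$V(0, S) = \max_{\{q_t\}} \int_0^\infty \theta_0 e^{-\theta_0 \tau_0} \left\{ \int_0^{\tau_0} e^{-rt}\,[u(q_t) + Z + \xi^*]\,dt + e^{-r\tau_0}\, V(1, S_{\tau_0}) \right\} d\tau_0 \tag{3}$$

subject to $\dot{S}_t = -q_t$; 

$S_0 = S$; 

$S_t, q_t \ge 0$. 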


Two things should be noted here. First, we see that since there is an 
embargo going on, the budget constraint has been split into two: $\dot{S}_t = -q_t$ 
and $z_t = Z + \xi_t$. Second, we have set $\xi_t = \xi^*$ (a constant) in the 
functional equation; thus $z_t = Z + \xi^*$ and the maximization is made 
over the control variable $q_t$ only. If no credit market exists, $\xi^* = 0$ by 
(1) and $z_t = Z$. If a perfect credit market exists, however, things are 
somewhat different. Then $\xi^*$ is a negative constant, which can be 
interpreted as the steady-state rate of debt service and which can be 
justified in the following way. During an embargo, there are two 
decisions to be made: how to deplete the stock $S_t$ and how to divide 
the potato crop $Z$ between consumption $z_t$ and debt service $\xi_t$. Since 
the utility function is additively separable in $q_t$ and $z_t$, these two decisions 
are independent of each other, and therefore the optimal 
policies $\{z_t\}$ and $\{\xi_t\}$ do not affect the optimal depletion policy $\{q_t\}$. 
Now, since the utility function is linear in $z_t$, the country is indifferent 
between different time profiles of debt service $\{\xi_t\}$ satisfying (2). Further, 
lenders are also indifferent between different paths $\{\xi_t\}$ as long 
as (2) is satisfied. We therefore choose the most convenient of these 
paths, namely, the steady-state one with $\xi_t = \xi^*$. For the time being we 
regard $\xi^*$ as an exogenous constant; in the Appendix we will derive 
an explicit formula for it. 

Integrating by parts, (3) can be equivalently written as 5 

$$V(0, S) = \max_{\{q_t\}} \int_0^\infty [u(q_t) + Z + \xi^* + \theta_0 V(1, S_t)]\, e^{-(r + \theta_0)t}\,dt \tag{4}$$

subject to $\dot{S}_t = -q_t$; 

$S_0 = S$; 

$S_t, q_t \ge 0$. 
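
The integration-by-parts step that links (3) to (4) is not spelled out in the text. A sketch of it follows, using the shorthand $w_t = u(q_t) + Z + \xi^*$ and $F(\tau) = \int_0^{\tau} e^{-rt} w_t\,dt$, both introduced here only for this illustration, and assuming that $e^{-\theta_0 \tau} F(\tau)$ vanishes as $\tau \to \infty$.

```latex
% Shorthand (introduced for this sketch only):
%   w_t = u(q_t) + Z + \xi^*,   F(\tau) = \int_0^{\tau} e^{-rt} w_t \, dt,   F(0) = 0.
\begin{align*}
\int_0^\infty \theta_0 e^{-\theta_0 \tau} F(\tau)\, d\tau
  &= \Bigl[ -e^{-\theta_0 \tau} F(\tau) \Bigr]_0^\infty
     + \int_0^\infty e^{-\theta_0 \tau}\, e^{-r\tau} w_\tau \, d\tau \\
  &= \int_0^\infty w_\tau \, e^{-(r + \theta_0)\tau} \, d\tau , \\[4pt]
\int_0^\infty \theta_0 e^{-\theta_0 \tau}\, e^{-r\tau}\, V(1, S_\tau)\, d\tau
  &= \int_0^\infty \theta_0 V(1, S_\tau)\, e^{-(r + \theta_0)\tau} \, d\tau .
\end{align*}
% Adding the two lines and renaming \tau as t gives the integrand
% [u(q_t) + Z + \xi^* + \theta_0 V(1, S_t)] e^{-(r + \theta_0) t}.
```

The hazard rate $\theta_0$ thus enters twice: it raises the effective discount rate from $r$ to $r + \theta_0$, and it weights the continuation value $V(1, S_t)$ received when the embargo ends.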

The functional equation (4) thus describes the planning problem 
when an embargo is in effect. When there is no embargo, the problem 
looks different. Then optimal behavior implies 

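$$V(1, S) = \max_{\{q_t, z_t, \xi_t\}} E_1\left\{\int_0^{\tau_1} e^{-rt}\,[u(q_t) + z_t]\,dt + e^{-r\tau_1}\,V(0, S_{\tau_1})\right\} \tag{5}$$

subject to $p\dot{S}_t + pq_t = Z - z_t + \xi_t$;
either (1) or (2);

$S_0 = S$;

$S_t,\ q_t,\ z_t \ge 0$;

or equivalently (again integrating by parts):

$$V(1, S) = \max_{\{q_t, z_t, \xi_t\}} \int_0^\infty [u(q_t) + z_t + \theta_1 V(0, S_t)]\,e^{-(r+\theta_1)t}\,dt \tag{6}$$

subject to $p\dot{S}_t + pq_t = Z - z_t + \xi_t$;
either (1) or (2);

$S_0 = S$;

$S_t,\ q_t,\ z_t \ge 0$.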

$^{5}$ Here we see that the assumption of an exponential probability distribution of the
length of embargoes gives rise to a "discount factor" of the form $e^{-(r+\theta_0)t}$. Thus the
solution to our planning problem (4) is dynamically consistent in the Strotz (1955-56)
sense, while no other probability distribution will yield consistent depletion paths
Equations (4) and (6) may be taken to represent the problem of
optimal reserve management, and their simultaneous solution for the
functions $V(0, S)$ and $V(1, S)$ yields the optimal inventory policy. In
general this problem of dynamic programming with a switching of
regimes is very difficult. Application of Bellman's "principle of
optimality" would lead to a pair of nonlinear differential equations to be
solved for $V(0, S)$ and $V(1, S)$. As will be demonstrated below,
however, this problem is not insurmountable. It will be instructive to
concentrate on one equation at a time; therefore, in the next section
we will study the function $V(0, S)$ taking $V(1, S)$ as given. In Section IV
we will then study $V(1, S)$ while taking $V(0, S)$ as given. In
particular, we will show that under the simplifying assumption of a perfect
capital market (i.e., the borrowing constraint [2] is applicable) the
system becomes recursive and a simple solution can be obtained.


III. The Optimal Depletion of Emergency Reserves 

Necessary conditions for maximization of (4), taking the function
$V(1, S_t)$ as given, are

$$\lambda_t^0 = u'(q_t) \tag{7}$$

$$\dot{\lambda}_t^0 = (r + \theta_0)\lambda_t^0 - \theta_0 V_S(1, S_t), \tag{8}$$

where the superscript 0 indicates that the costate variable is the one
associated with the maximum value function $V(0, S)$ (that is, an
embargo is going on) and where the dot indicates the time derivative.
Further, $V_S(1, S_t) = \partial V(1, S_t)/\partial S_t$. Note that we have disregarded the
shadow prices associated with the nonnegativity constraints on $S_t$ and
$q_t$, since the infinite slope of $u(q)$ at $q = 0$ will always ensure strictly
positive values for $S_t$ and $q_t$.

The spot price $\lambda_t^0$ is defined as the marginal value of oil in stock as
long as the embargo is going on. An intuitive interpretation of (8) can
be obtained if we write the price path in the following form:

$$\frac{\dot{\lambda}_t^0}{\lambda_t^0} + \theta_0\,\frac{V_S(1, S_t) - \lambda_t^0}{\lambda_t^0} = r. \tag{9}$$


This formula is in fact a special case of the Hotelling (1931) principle
of resource depletion. It simply says that if the embargo is still going
on at date $t$, then the expected rate of price increase of oil in stock
should be equal to the rate of interest. That the left-hand side of (9)
could be interpreted as the expected rate of price increase is evident
from the fact that $\lambda_t^0$ is the spot shadow price of oil in stock during an
embargo while $V_S(1, S_t)$ is the marginal valuation of oil in stock in case
the embargo should end. Thus, $V_S(1, S_t) - \lambda_t^0$ is the marginal capital
gain (or rather capital loss) from holding oil, should the embargo
expire, and $\theta_0[V_S(1, S_t) - \lambda_t^0]$ is the expected marginal capital gain.$^{6}$

Equation (9) is identical to the one obtained by Dasgupta and Stiglitz
(1981) in their study of resource depletion under technological
uncertainty. In their terminology, $V_S(1, S_t)$ is the fallback price of the
resource after the new technology has been introduced. The formula
differs from the one derived by Dasgupta and Heal (1974). In that
paper the authors assume that $V(1, S_t) = V_S(1, S_t) = 0$, which means
that after the breeder (or the solar energy) has been introduced, all
remaining oil will be worthless.$^{7}$

To obtain a solution to the problem of optimal depletion when the
embargo length is uncertain, we thus have to solve the system of
differential equations

$$\dot{\lambda}_t^0 = (r + \theta_0)\lambda_t^0 - \theta_0 V_S(1, S_t) \tag{10}$$

$$\dot{S}_t = -(u')^{-1}(\lambda_t^0). \tag{11}$$

For each function $V_S(1, S_t)$ this system has an infinite number of
solutions $\{\lambda_t^0\}$, $\{S_t\}$. We must therefore impose some initial conditions on
the paths. For $\{S_t\}$ this is self-evident; it must hold that

$$S_0 = S. \tag{12}$$


$^{6}$ For the particular case of instantaneous stock adjustment analyzed in Sec. IV below,
the necessary condition (9) will be of the same functional form as the necessary
condition for the deterministic problem of optimal depletion of a natural resource with a
constant marginal extraction cost (see further Sec. IV). This parallel does not hold,
however, for the general case (9).

$^{7}$ Tolley and Wilman's (1977) paper on oil embargoes uses a formal setup quite
different from that of our model. It is thus hard to relate their results to ours, but some
comparisons can be made. They make two special assumptions that together result in a
policy different from the one implied by (9). First, they assume explicitly that the
embargo period is so short that the discount rate can be disregarded, i.e., that $r = 0$.
Second, they treat the utility of oil remaining in storage at the end of an embargo in a
quite different manner. In fact, they assume that the oil stock should be completely
exhausted during the embargo, and they calculate the optimal date $t$ (which is less than
or equal to the duration of the embargo) at which the stock should be depleted. This
results in an optimal policy saying that the rate of price increase ($\dot{\lambda}_t^0/\lambda_t^0$ in our notation)
should be equal to the probability rate that the embargo will end in the next moment,
given that it has not already ended. With an exponential probability distribution for the
duration of the embargo, this conditional probability rate is equal to our parameter $\theta_0$.
Thus, (9) would imply the same policy as that of Tolley and Wilman if we set $r = 0$ and
$V_S(1, S_t) = 0$. This seems, however, rather restrictive. In general, the embargo period
might last for a long time, the average length being $1/\theta_0$, and the interest costs could be
rather high. Further, with our assumption of an infinite marginal utility $u'(q)$ at $q = 0$, it
can never be an optimal policy to deplete the stock before the embargo has expired.
And thus some oil will always remain at the terminal date. This oil could in general be
used later (in the free-trade regime that follows) for some purpose, e.g., consumption,
or refilling the emergency reserves in case a new embargo should occur, or selling in
the world market.
To obtain the initial price $\lambda_0^0$, we impose the resource constraint

$$\int_0^\infty (u')^{-1}(\lambda_t^0)\,dt = S. \tag{13}$$

Of course, the embargo will end in finite time with a probability
arbitrarily close to unity under our assumptions, so the proper
interpretation of (13) is that the inventories should be allocated during the
embargo so as to ensure consumption at each date [$u'(0) = \infty$], but not
to leave redundant stocks asymptotically.

Let us denote the initial shadow price $\lambda_0^0$, given by (12) and (13), by

$$\lambda_0^0 = \lambda_0^0[S, V_S(1, \cdot)], \tag{14}$$

where the notation $V_S(1, \cdot)$ is used to indicate that $\lambda_0^0$ does not depend
simply on a particular value of the $V_S(1, \cdot)$ function, but on the entire
functional form. The system (10), (11), together with the endpoint
restrictions (12) and (13), thus gives us the optimal paths $\{\lambda_t^0\}$, $\{S_t\}$ for
each initial stock $S$ and each given function $V(1, \cdot)$. For a formal proof
of the existence and uniqueness of a solution, the same method of
proof as the one of Dasgupta and Stiglitz (1981, p. 103) can be
applied. In the next section, we will deal with the $V(1, \cdot)$ function.


IV. Optimal Storage during Free Trade 

As soon as the embargo has expired and a regime of free trade has
begun, the country will start to build up its emergency reserves again.
The optimal stockpiling policy during free trade must satisfy the
functional equation (6). The solution is given by the system of
differential equations

$$\dot{\lambda}_t^1 = (r + \theta_1)\lambda_t^1 - \theta_1 V_S(0, S_t) \tag{15}$$

$$\dot{S}_t = \frac{Z - z_t + \xi_t}{p} - (u')^{-1}(\lambda_t^1), \tag{16}$$

where the superscript 1 indicates that the shadow price $\lambda_t^1$ is the one
corresponding to (6), that is, to a free-trade regime.$^{8}$ We note that
the entity $V_S(0, S_t)$ appearing in (15) is the marginal valuation of an oil
stock $S_t$ if the free-trade regime should end and an embargo be
imposed at time $t$. That is, it is identical to the $\lambda_0^0[S_t, V_S(1, \cdot)]$ of (14) in the
previous section.

$^{8}$ Since $u'(q)$ goes to infinity as $q \to 0$, we know that we will always have an inner
solution with respect to $q_t$; i.e., $\lambda_t^1 = u'(q_t)$ will always be satisfied as an equality. Note,
however, that we have disregarded the shadow prices associated with the nonnegativity
constraints on $z_t$. There is no reason to dismiss a corner solution $z_t = 0$ as impossible per
se; it applies to a country with such a low level of real income that the whole potato crop
will be sold to finance the purchase of oil. This case is analyzed in Bergstrom, Loury,
and Persson (1983). Here we consider only the situation with an inner solution.

It is easily shown that two paths $\{\lambda_t^1\}$, $\{S_t\}$ satisfying the equation
system (15), (16) and leading to the steady state $(\lambda^{1*}, S^*)$ satisfy the
transversality conditions.$^{9}$ Thus these paths are optimal. Using the
same notation as in (14) above, we denote the initial value of the $\{\lambda_t^1\}$
path by

$$\lambda_0^1 = \lambda_0^1[S, V_S(0, \cdot)]. \tag{17}$$

Identifying $V_S(0, \cdot)$ with $\lambda_0^0(\cdot)$ and $V_S(1, \cdot)$ with $\lambda_0^1(\cdot)$, (14) and (17) thus
form a nonlinear equation system in the unknown functions $\lambda_0^0(\cdot)$ and
$\lambda_0^1(\cdot)$. This system is very difficult to solve in general.

It may seem intuitively reasonable to set $V_S(1, S_t)$ equal to the world
market price $p$: in a free-market equilibrium the marginal valuation
of a unit of oil should be equal to the relative price. That is not true,
however, in the absence of a capital market. During the embargo, the
country has been depleting its oil reserves in the fashion analyzed in
Section III above, and when the free-trade regime begins, the size of
the remaining stock will in general be below the optimal size $S^*$. An
instantaneous stock adjustment $(S^* - S_t)$ is possible only if a perfect
international credit market exists. Otherwise the country's purchases
of oil in the world market are constrained by the borrowing
restriction (1). Thus to build up the stockpile will take some time, and
during this interval (which will be shorter, the shorter was the
preceding embargo) the marginal valuation of oil will be greater than the
world market price.$^{10}$ Immediately after an embargo, the country will
thus find itself in a corner solution; the oil reserves are so small that it
must use its entire potato harvest to purchase oil.$^{11}$ In that case,
$V_S(1, S_t) > p$. However, if one can assume that the adjustment to the optimal
stock can be accomplished instantaneously, once the free-trade
regime has started, the equation system (14) and (17) can be solved and
analytical solutions be obtained.

$^{9}$ The optimal paths are analyzed by means of phase diagrams in Bergstrom et al.
(1983).

$^{10}$ Note that this does not depend on our assumption of a constant marginal utility of
potato consumption. If the utility function were written as $U(q_t, z_t) = u(q_t) + v(z_t)$
instead, with all the usual assumptions on $v(z_t)$ being satisfied, we could still be in a
situation where our oil reserves are so small that $V_S(1, S_t) > p$. In the absence of a perfect
capital market, the borrowing constraint (1) imposes such a restriction on the country
that the marginal value of potatoes is greater than unity and the marginal value of oil is
greater than $p$. Even for a case with a perfect capital market, $V_S(1, S_t)$ could be greater
than the world market price, namely, if the country is so large that (by its monopsony
power) it bids up the world market price when refilling its reserves. While this case is
certainly relevant to, e.g., the strategic reserves of the United States, it requires quite a
different model structure and is therefore disregarded in the present paper.

$^{11}$ This can be seen immediately by inspection of (6). The functional equation is linear
in the control $z_t$, which implies a "bang-bang" solution.
So from now on we will assume that a perfect international credit
market exists.$^{12}$ The country is thus not constrained by (1) but by (2).
Then our assumption of a constant marginal utility of the
background good, together with the assumption of a small country with no
monopsony power, makes it optimal to adjust the stock
instantaneously to the desired level $S^*$. This means that $\xi_t = p(S^* - S_t)$ for
$t = \tau_0$, while $\xi_t = \xi^* < 0$ for all other $t$ (see App.).$^{13}$ This implies that we
are in a steady state $(q^*, z^*, S^*, \lambda^{1*})$ during the whole free-trade
regime.

The equation system (15), (16) gives us the steady-state solution
$(\lambda^{1*}, S^*)$ by setting $\dot{\lambda}_t^1 = \dot{S}_t = 0$. Thus

$$(r + \theta_1)\lambda^{1*} = \theta_1 V_S(0, S^*) \tag{18}$$

$$(u')^{-1}(\lambda^{1*}) = \frac{Z - z^* + \xi^*}{p}. \tag{19}$$

With instantaneous stock adjustment, the marginal valuation of oil in
stock at the beginning of a free-trade regime is equal to the world
market price: $V_S(1, S_t) = \lambda_0^1 = \lambda_t^1 = \lambda^{1*} = p$. This means that the
problem can be drastically simplified. In fact the system of functional
equations (4) and (6) (or, equivalently, the system of equations (14)
and (17)) becomes recursive, and a closed-form solution can be
obtained.

The optimal level of inventories $S^*$ is now given implicitly by (18):

$$V_S(0, S^*) = \frac{p(r + \theta_1)}{\theta_1}. \tag{20}$$

This formula lends itself to a simple, intuitive interpretation. In free
trade, the marginal cost of storing one additional unit of oil is $p$. The
marginal benefit is the expected, discounted marginal value of that
unit should an embargo be imposed. The marginal value at the
beginning of this future embargo is $V_S(0, S^*)$, and since it occurs at the
random date $\tau_1$, the discounted value is $e^{-r\tau_1} V_S(0, S^*)$. The date $\tau_1$ is
exponentially distributed; thus

$$E_1[e^{-r\tau_1} V_S(0, S^*)] = \theta_1 \int_0^\infty e^{-r\tau_1} e^{-\theta_1\tau_1}\,d\tau_1\; V_S(0, S^*) = \frac{\theta_1}{r + \theta_1}\,V_S(0, S^*).$$

$^{12}$ For a more detailed discussion of the case with no credit market, see Bergstrom et
al. (1983). Even without the assumption of a perfect capital market, the parameter
values $\theta_0$, $\theta_1$, $p$, and/or $Z$ could of course be such that the reserves can be fairly quickly
restored to the level $S^*$ after an embargo. Thus the assumption of instantaneous stock
adjustment could be a reasonable approximation to reality and thereby a permissible
simplification, even if no capital market exists.

$^{13}$ Rigorously speaking, $\lim_{\Delta t \to 0} \int_{\tau_0}^{\tau_0 + \Delta t} \xi_t\,dt = p(S^* - S_t)$. See n. 2 above.
Formula (20) therefore says that, at the optimal stock, the marginal
cost $p$ of storing one additional unit of oil should equal the expected
discounted marginal benefit $V_S(0, S^*)[\theta_1/(r + \theta_1)]$.
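
The discount factor $\theta_1/(r + \theta_1)$ applied to the marginal benefit can be checked by simulation. The sketch below is illustrative only: the function names and the parameter values are assumptions of this example, not part of the model, and the exponential waiting time with rate $\theta_1$ follows the distributional assumption stated in the text.

```python
import math
import random


def expected_discount_closed_form(r, theta):
    """E[exp(-r * tau)] for tau ~ Exponential(rate=theta), i.e. theta / (r + theta)."""
    return theta / (r + theta)


def expected_discount_simulated(r, theta, draws=200_000, seed=0):
    """Monte Carlo estimate of E[exp(-r * tau)] with exponentially distributed tau."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        tau = rng.expovariate(theta)  # waiting time until the embargo, mean 1/theta
        total += math.exp(-r * tau)
    return total / draws


if __name__ == "__main__":
    r, theta1 = 0.05, 0.10  # hypothetical interest rate and embargo arrival rate
    print("closed form:", expected_discount_closed_form(r, theta1))
    print("simulated:  ", expected_discount_simulated(r, theta1))
```

With these hypothetical values the expected discount factor is 2/3, so a unit stored today at cost $p$ is worth storing only if its shadow value at the start of an embargo is at least $1.5p$.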

Since $V_S(0, S) = \lambda_0^0$, this implies that for a stock $S = S^*$,

$$\lambda_0^0 = \frac{p(r + \theta_1)}{\theta_1}, \tag{21}$$

which gives us a more specific characterization of the initial shadow
price during an embargo than does the general formula (14).

Let us now return to the modified Hotelling formula (9). Substituting
$p$ for $V_S(1, S_t)$, we can gain some further insight into the economic
interpretation of the model by rewriting (9) as

$$\frac{\dot{\lambda}_t^0}{\lambda_t^0 - [\theta_0/(r + \theta_0)]p} = r + \theta_0. \tag{22}$$

Equation (22) looks exactly like the one (see Hotelling 1931; Solow
1974) describing the optimal depletion of a natural resource with a
constant marginal extraction cost equal to $p\theta_0/(r + \theta_0)$ and with an
interest rate $r + \theta_0$. The reason $p\theta_0/(r + \theta_0)$ can be interpreted as an
extraction cost in our model is the following. A unit of oil taken
out of the stock has to be replenished when the embargo ends. The
expected, discounted replenishment cost is $E_0[e^{-r\tau_0}p] = [\theta_0/(r + \theta_0)]p$,
where the expectations operator $E_0[\cdot]$ is defined by the density
function $f_0(\tau_0)$ of Section II. The discount rate $r + \theta_0$ applies since future
consumption is discounted both for time preference reasons and
because the return of free trade permits the replenishment of
inventories.

To derive the size of the optimal stock $S^*$, we now solve the linear
differential equation (22) with its initial condition (21):

$$\lambda_t^0 = \left(\frac{r + \theta_1}{\theta_1} - \frac{\theta_0}{r + \theta_0}\right)p\,e^{(r+\theta_0)t} + \frac{\theta_0}{r + \theta_0}\,p. \tag{23}$$
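
The step from the differential equation to its solution can be made explicit. The following is a sketch under the assumptions already stated (instantaneous stock adjustment, so that the fallback value of oil equals $p$); $\lambda_t^0$ denotes the embargo shadow price, and the shorthand $c$ and $\mu_t$ is introduced here only for the derivation.

```latex
% Shorthand (assumptions of this sketch): c = \theta_0 p/(r+\theta_0), \mu_t = \lambda_t^0 - c.
\begin{align*}
\dot{\lambda}_t^0 &= (r+\theta_0)\,(\lambda_t^0 - c)
  && \text{(equation (22) rearranged)} \\
\dot{\mu}_t &= (r+\theta_0)\,\mu_t
  \;\Longrightarrow\; \mu_t = \mu_0\,e^{(r+\theta_0)t}
  && \text{(since $c$ is constant)} \\
\lambda_t^0 &= (\lambda_0^0 - c)\,e^{(r+\theta_0)t} + c,
  \qquad \lambda_0^0 = \frac{p(r+\theta_1)}{\theta_1}
  && \text{(initial condition (21))}
\end{align*}
```

Because $(r+\theta_1)/\theta_1 > 1 > \theta_0/(r+\theta_0)$, the coefficient $\lambda_0^0 - c$ is strictly positive, so the shadow price rises without bound and consumption out of the stock declines toward zero as the embargo continues.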

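Substituting this into the resource constraint (13) for $S = S^*$ gives us

$$S^* = \int_0^\infty (u')^{-1}\!\left[\left(\frac{r + \theta_1}{\theta_1} - \frac{\theta_0}{r + \theta_0}\right)p\,e^{(r+\theta_0)t} + \frac{\theta_0}{r + \theta_0}\,p\right]dt. \tag{24}$$

The comparative statics are straightforward:

$$\frac{\partial S^*}{\partial \theta_0} < 0, \qquad \frac{\partial S^*}{\partial \theta_1} > 0, \qquad \frac{\partial S^*}{\partial p} < 0, \qquad \frac{\partial S^*}{\partial r} < 0.$$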
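
To see the optimal stock and its comparative statics at work, the integral can be evaluated for a specific utility function. The sketch below assumes logarithmic utility, $u(q) = \ln q$ (so that demand at shadow price $\lambda$ is $1/\lambda$ and marginal utility is infinite at zero); this functional form, the function names, and the parameter values are illustrative assumptions, not taken from the paper. Under log utility the integral also has a closed form, derived for this example, which is used as a cross-check.

```python
import math


def embargo_price_path(t, p, r, theta0, theta1):
    """Shadow price of oil t periods into an embargo, per the closed-form path (23)."""
    c = theta0 * p / (r + theta0)      # expected discounted replenishment cost
    lam0 = p * (r + theta1) / theta1   # initial shadow price, per (21)
    return (lam0 - c) * math.exp((r + theta0) * t) + c


def optimal_stock_numeric(p, r, theta0, theta1, steps=100_000):
    """Approximate the stock integral (24) for u(q) = ln q with the trapezoid rule."""
    horizon = 40.0 / (r + theta0)      # the integrand is negligible beyond this point
    h = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        # With log utility, consumption equals the reciprocal of the shadow price.
        total += weight / embargo_price_path(i * h, p, r, theta0, theta1)
    return total * h


def optimal_stock_closed_form(p, r, theta0, theta1):
    """Closed form of the same integral under log utility (derived for this sketch)."""
    c = theta0 * p / (r + theta0)
    lam0 = p * (r + theta1) / theta1
    return math.log(lam0 / (lam0 - c)) / (theta0 * p)


if __name__ == "__main__":
    base = dict(p=1.0, r=0.05, theta0=1.0, theta1=0.10)  # hypothetical parameters
    print("numeric S*:    ", optimal_stock_numeric(**base))
    print("closed-form S*:", optimal_stock_closed_form(**base))
```

Varying one parameter at a time in `optimal_stock_closed_form` reproduces the signs listed above for this utility function: the stock falls when embargoes are shorter on average (higher $\theta_0$), when oil is more expensive, or when the interest rate is higher, and it rises when embargoes arrive more often (higher $\theta_1$).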


V. Some Results on Welfare

The model above describes the optimal reserve management of an
oil-importing country. A question that naturally arises in this context
is whether one can find an expression for the cost to the country, in
terms of expected utility, of being subject to embargo threats. In fact,
the assumption of a perfect capital market allows us to derive explicit
formulae for $V(0, S^*)$ and $V(1, S^*)$. These expressions make it possible
to evaluate the welfare loss of an oil-importing country from being
subject to embargo threats (see Loury 1983).

Assume that the country is in a free-trade regime with its stock at
the optimal level $S^*$. It can then be shown (see App.) that the expected
welfare is equal to

$$V(1, S^*) = \frac{r + \theta_0}{r + \theta_0 + \theta_1}\,\frac{1}{r}\,[v(p) + Z + \xi^*] + \frac{\theta_1}{r + \theta_0 + \theta_1}\,\frac{1}{r}\left[v\!\left(\frac{p(r + \theta_1)}{\theta_1}\right) + Z + \xi^*\right], \tag{25}$$

where $v(p) + Z + \xi^*$ is the indirect instantaneous utility in steady
state:

$$v(p) + Z + \xi^* = \max_{q^*,\,z^*}\; u(q^*) + z^*$$

subject to $pq^* + z^* = Z + \xi^*$. Working out the optimization, we see
that the $v(p)$ function is $v(p) = u[(u')^{-1}(p)] - p\,(u')^{-1}(p)$.

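Similarly, it can be shown that at the beginning of an embargo, the
expected welfare of a country with a stock $S^*$ is equal to

$$V(0, S^*) = \frac{\theta_0}{r + \theta_0 + \theta_1}\,\frac{1}{r}\,[v(p) + Z + \xi^*] + \frac{r + \theta_1}{r + \theta_0 + \theta_1}\,\frac{1}{r}\left[v\!\left(\frac{p(r + \theta_1)}{\theta_1}\right) + Z + \xi^*\right]. \tag{26}$$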


Equation (25) says that in a free-trade regime, expected discounted
utility under optimal inventory behavior is a weighted average of the
welfare we would have if free trade always ruled, the price of oil were
$p$, and our income were $Z + \xi^*$, and the welfare we would have if free
trade always ruled, the price of oil were $p(r + \theta_1)/\theta_1$, and our income
were $Z + \xi^*$. Similarly, equation (26) says that given that an embargo
has been imposed, expected discounted utility is also a weighted
average of the utilities of free-trade regimes with these prices, but the
weights are somewhat different. Interpreting the weights as
probabilities, one can say that if free trade currently prevails, the welfare
under an embargo threat characterized by the parameters $\theta_0$ and $\theta_1$ is
equal to the welfare of a perpetual free-trade regime with prices
randomly fluctuating between $p$ and $p(r + \theta_1)/\theta_1$ with probabilities
$(r + \theta_0)/(r + \theta_0 + \theta_1)$ and $\theta_1/(r + \theta_0 + \theta_1)$, respectively.
Applying Jensen's inequality to (25) and (26), we gain some insight
into the insurance-theoretic properties of trade disruptions:

$$V(1, S^*) > \frac{1}{r}\,v\!\left[\left(1 + \frac{r}{r + \theta_0 + \theta_1}\right)p\right] + \frac{1}{r}\,(Z + \xi^*),$$

$$V(0, S^*) > \frac{1}{r}\,v\!\left[\left(1 + \frac{r(r + \theta_1)}{(r + \theta_0 + \theta_1)\theta_1}\right)p\right] + \frac{1}{r}\,(Z + \xi^*).$$

This means that, under free trade, a country that runs an optimal
stockpile and is subject to an embargo threat $(\theta_0, \theta_1)$ will never be
worse off than a country that has entered a long-term contract
guaranteeing perpetual deliveries at a price $\{1 + [r/(r + \theta_0 + \theta_1)]\}p$ and
has a flow of income equal to $Z + \xi^*$. Similarly, a country now subject
to an embargo that has a stock $S^*$ is never worse off than a country
with a long-term contract of oil at the price
$\{1 + [r(r + \theta_1)/(r + \theta_0 + \theta_1)\theta_1]\}p$ that has a perpetual flow of income $Z + \xi^*$.

Thanks to the assumption of a constant marginal utility of income,
we can solve the model for the maximum premium $\alpha$ above the price
$p$ that a country would be willing to pay for oil to be guaranteed safe
deliveries for all future. For a country that is presently in a free-trade
regime, this premium $\alpha_1$ is given by

$$V(1, S^*) = \frac{1}{r}\,v[(1 + \alpha_1)p] + \frac{Z}{r},$$

while for a country with a stock $S^*$ that is presently at the beginning
of an embargo, the premium $\alpha_0$ is given by

$$V(0, S^*) = \frac{1}{r}\,v[(1 + \alpha_0)p] + \frac{Z}{r}.$$

The $V(1, S^*)$ and $V(0, S^*)$ are just numbers, given by (25) and (26).
The last term on the right-hand side is due to the fact that if the
country enters such a contract, it can sell its emergency stock $S^*$ and
repay its debts, and thus it no longer has to service its debts with a
perpetual flow $\xi^*$.
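
The Jensen bound and the premium can be illustrated numerically. The sketch below again assumes logarithmic utility, for which the indirect utility is $v(p) = -\ln p - 1$; the debt-service flow $\xi^*$ is treated as a given number, and all function names and parameter values are hypothetical choices made for this example.

```python
import math


def v_log(price):
    """Indirect utility v(p) = u(q*) - p*q* for u(q) = ln q, where q* = 1/p."""
    return -math.log(price) - 1.0


def welfare_free_trade(p, r, theta0, theta1, Z, xi_star):
    """Expected welfare in free trade at the optimal stock, following (25)."""
    p_embargo = p * (r + theta1) / theta1
    denom = r + theta0 + theta1
    w_low, w_high = (r + theta0) / denom, theta1 / denom
    return (w_low * (v_log(p) + Z + xi_star)
            + w_high * (v_log(p_embargo) + Z + xi_star)) / r


def jensen_lower_bound(p, r, theta0, theta1, Z, xi_star):
    """Welfare under a perpetual contract at the probability-weighted mean price."""
    contract_price = (1.0 + r / (r + theta0 + theta1)) * p
    return (v_log(contract_price) + Z + xi_star) / r


def premium_free_trade(p, r, theta0, theta1, Z, xi_star):
    """Solve V(1,S*) = v((1 + alpha_1) p)/r + Z/r for alpha_1 under log utility."""
    V1 = welfare_free_trade(p, r, theta0, theta1, Z, xi_star)
    guaranteed_price = math.exp(-(r * V1 - Z) - 1.0)  # inverts v_log
    return guaranteed_price / p - 1.0


if __name__ == "__main__":
    params = dict(p=1.0, r=0.05, theta0=1.0, theta1=0.10, Z=10.0, xi_star=-0.02)
    print("V(1,S*):     ", welfare_free_trade(**params))
    print("Jensen bound:", jensen_lower_bound(**params))
    print("premium a_1: ", premium_free_trade(**params))
```

In this parameterization the premium is positive for two reasons: the contract removes the occasional high embargo-period shadow price, and it frees the country from the debt service $\xi^* < 0$ that finances the stockpile.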


VI. Concluding Comments

In our paper we have made one basic assumption that serves to
facilitate the analysis, namely, that the price of oil is constant over time. As
pointed out above, this assumption is made mainly for expositional
reasons and could easily be dispensed with. Dropping it would,
however, raise a few problems that have to be considered before it is
[...]. One's first thought is that if we want a more realistic description of
[...], the price of oil should be assumed to increase over
time according to the Hotelling rule. This is, however, not
self-evident. First, oil reserves would be depleted according to the
Hotelling rule (i.e., in a fashion such that the price increases at the rate of
interest) only if the oil producers work under competitive
conditions.$^{14}$ But this is not conformable to our assumption of embargo
threats; if it is possible for a producer to impose embargoes, some
kind of monopoly power must exist in the oil market. Second,
assuming a well-functioning cartel among the producers together with a
constant price elasticity of demand among the consumers, a price
path $p_t = p_0 e^{rt}$ means that the embargo problem would degenerate
into a trivial one: the optimal stock $S^*$ becomes infinite. And then our
assumption that the oil-consuming nation is a small, price-taking
country would no longer hold. That $S^*$ is infinite is easily seen from
the fact that with the world price of oil increasing (deterministically)
at the rate $r$, which is also the interest rate, oil would dominate other
assets in the country's portfolio. It would be costless to borrow huge
quantities of money and hold it in the form of oil; if no embargo
occurs, such a policy breaks even since the cost of borrowing is exactly
matched by the capital gain of holding the asset, and if an embargo is
imposed the holding of oil becomes more profitable. To cope with
this, we must introduce some imperfections in the international credit
markets, or we must introduce storage costs or uncertainty about
future oil prices into the model to make oil a less dominating asset in
the country's portfolio.

Third, assume that the price of oil increases at some rate $\rho$ that may
or may not be equal to the rate of interest. We still have to decide what
happens to the world market price during the embargo. If the
oil-importing country is a small one and if no other country is affected by
the embargo, it is reasonable to assume that the world market price
will continue to increase at the rate $\rho$ throughout the embargo. If the
importing country is a large one, however, or if a large part of the
world is subject to the embargo, one could imagine that the world
market price of oil would rise at a lower rate (or even remain
constant) and not start to increase at the rate $\rho$ again until the embargo
has been called off. The solution to our country's decision problem is
of course affected by which one of these two scenarios is considered to
be the most realistic one.


$^{14}$ Or under monopoly, if demand has a constant price elasticity.
Appendix

I. The Derivation of the Welfare Formulae

Equations (25) and (26) can be derived in the following way. Assume that the
country is in a free-trade regime with an optimal stock $S^*$. We can obtain an
expression for $V(1, S^*)$ by evaluating the integral in (6) at $q_t = q^* = (u')^{-1}(p)$,
$z_t = z^* = Z - pq^* + \xi^*$, and $S_t = S^*$:

$$V(1, S^*) = \frac{1}{r + \theta_1}\,[v(p) + Z + \xi^*] + \frac{\theta_1}{r + \theta_1}\,V(0, S^*), \tag{A1}$$

where $v(p) = u[(u')^{-1}(p)] - p\,(u')^{-1}(p)$.

Consider now the maximum value function for an embargo (4). The basic
differential equation for this dynamic programming problem (see, e.g.,
Kamien and Schwartz 1981, pp. 241-42) is the following:

$$V(0, S) = \frac{\theta_0}{r + \theta_0}\,V(1, S) + \frac{1}{r + \theta_0}\,\max_q\,[u(q) + Z + \xi^* - V_S(0, S)\,q]. \tag{A2}$$

Assume now that we start the embargo with an optimal stock $S^*$, as we in fact
do with a perfect capital market. Then by (21), $V_S(0, S^*) = \lambda_0^0 = p(r + \theta_1)/\theta_1$ and
(A2) can be written

$$V(0, S^*) = \frac{1}{r + \theta_0}\left[v\!\left(p\,\frac{r + \theta_1}{\theta_1}\right) + Z + \xi^*\right] + \frac{\theta_0}{r + \theta_0}\,V(1, S^*). \tag{A3}$$

Equations (A1) and (A3) form a linear system in the two unknowns $V(0, S^*)$
and $V(1, S^*)$ with the solution (25) and (26).
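
The claim that the two-equation linear system yields the weighted-average welfare formulae can be verified directly. The sketch below solves the system by Cramer's rule and compares the result with the weighted-average expressions; it assumes logarithmic utility, so that $v(p) = -\ln p - 1$, and the function names and parameter values are hypothetical.

```python
import math


def v_log(price):
    """Indirect utility for u(q) = ln q: v(p) = ln(1/p) - 1."""
    return -math.log(price) - 1.0


def solve_welfare_system(p, r, theta0, theta1, Z, xi_star):
    """Solve (A1) and (A3) for (V(1,S*), V(0,S*)) by Cramer's rule."""
    a = v_log(p) + Z + xi_star                           # flow utility in free trade
    b = v_log(p * (r + theta1) / theta1) + Z + xi_star   # flow utility term in (A3)
    # (A1):  (r + theta1) * V1 - theta1 * V0 = a
    # (A3): -theta0 * V1 + (r + theta0) * V0 = b
    det = (r + theta1) * (r + theta0) - theta0 * theta1  # equals r * (r + theta0 + theta1)
    V1 = (a * (r + theta0) + theta1 * b) / det
    V0 = (theta0 * a + (r + theta1) * b) / det
    return V1, V0


def welfare_weighted_average(p, r, theta0, theta1, Z, xi_star):
    """Evaluate the weighted-average formulae (25) and (26) directly."""
    a = v_log(p) + Z + xi_star
    b = v_log(p * (r + theta1) / theta1) + Z + xi_star
    denom = r + theta0 + theta1
    V1 = ((r + theta0) / denom * a + theta1 / denom * b) / r
    V0 = (theta0 / denom * a + (r + theta1) / denom * b) / r
    return V1, V0


if __name__ == "__main__":
    params = dict(p=1.0, r=0.05, theta0=1.0, theta1=0.10, Z=10.0, xi_star=-0.02)
    print(solve_welfare_system(**params))
    print(welfare_weighted_average(**params))
```

The key algebraic fact is that the determinant $(r + \theta_1)(r + \theta_0) - \theta_0\theta_1$ simplifies to $r(r + \theta_0 + \theta_1)$, which is where the common denominator of the weights comes from.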

II. The Expression for $\xi^*$

Our assumption of a constant marginal utility of income, together with the
assumption of a small country with no monopsony power, makes it optimal to
adjust the stock at the beginning of the free-trade regime instantaneously to
the desired level $S^*$. This means that $\xi_t = p(S^* - S_t)$ for $t = \tau_0$, while
$\xi_t = \xi^* < 0$ for all other $t$. The constant $\xi^*$ can be interpreted as the steady-state rate of
debt service and is derived in the following way. Throughout a cycle
consisting of one free-trade period and one embargo period (with a total duration of
$\tau_1 + \tau_0$) the loan that financed the stock adjustment is serviced. Because of the
constant marginal utility of income, we can assume that this repayment will be
in the form of a constant debt service rate $\xi^*$. The capital market constraint
(2) says that during such a cycle, the loan has to be fully repaid, on the
average. This repayment condition can be written

$$E_0[p(S^* - S_{\tau_0})] = -\frac{\xi^*}{r}\,E_1\{E_0[1 - e^{-r(\tau_1 + \tau_0)}]\},$$

where the expectations $E_0[\cdot]$ and $E_1[\cdot]$ are defined by the exponential density
functions $f_0(\tau_0)$ and $f_1(\tau_1)$ of Section II above. This therefore gives us an
explicit formula for the constant $\xi^*$. Note that $\xi^*$ never appears in the
first-order conditions for optimal behavior derived in Sections III and IV. In
particular, $S^*$ in equation (24) is independent of $\xi^*$, which can therefore be
solved recursively from the model. This means that $\xi^*$ can actually be treated
as an exogenous constant, as is assumed in Section II above.

Current Account Dynamics and the Terms
of Trade: Harberger-Laursen-Metzler 
Two Generations Later 


Torsten Persson and Lars E. O. Svensson 
This is a study of the current account dynamics resulting from the
savings and investment dynamics in a small open economy that is 
subject to exogenous changes in its terms of trade and in world 
interest rates. Anticipated and unanticipated, as well as temporary 
and permanent, terms-of-trade changes have very different effects. 
There is, however, a general tendency toward cycles in both savings 
and investment, which gives rise to cycles in the current account. It is 
shown that the classic Harberger-Laursen-Metzler effect on saving 
of a terms-of-trade deterioration can have any sign for plausible 
parameter values, both for temporary and permanent disturbances. 


I. Introduction 

This paper deals with the current account adjustment over time of an 
economy subject to changes in its terms of trade and world interest 
rates. In particular, it is shown that disturbances in the terms of trade 
may lead to quite complicated current account dynamics, because of 
the interaction between the tilted profile of consumption and the 
accumulation of capital. The paper also highlights the important dif¬ 
ferences between anticipated, unanticipated, temporary, and perma¬ 
nent shocks. 
The current account can be written equivalently as the sum of the
trade account and the service account, as income minus absorption, or
as saving minus investment. With good general equilibrium theory,
the different components in the various identities are of course
simultaneously determined and no identity has a special significance,
except for convenience. Since we apply an intertemporal framework in
discussing the current account development over time, it will be
convenient to exploit the saving-minus-investment identity.

So, if we choose to look at the current account as the difference
between saving and investment, a first question is how saving
responds to a terms-of-trade change. Earlier, the conventional answer
to this question was summarized in the classical Harberger-Laursen-Metzler
effect: Harberger (1950) and Laursen and Metzler (1950)
postulated that saving out of any given income, both measured in
exportables, falls with a terms-of-trade deterioration, arguing that
real income falls with a terms-of-trade deterioration and relying on
empirical evidence that the average propensity to save is positively
related to real income. This argument relies on a static theory of
savings, however, which has recently led some writers to reconsider
the question of how terms-of-trade changes affect savings in an explicitly
intertemporal framework, with forward-looking savings behavior.

For instance, Obstfeld (1982a) applies a model of a small economy
consisting of an infinitely lived representative consumer with an
Uzawa (1968)-type utility function with the rate of time preference
being an increasing function of utility. Such an economy has a target
level of real wealth, at the point where the rate of time preference is
equal to the (given) world rate of interest. If the economy suffers a
terms-of-trade deterioration its real wealth is lowered. To converge to
the target level, it must accumulate foreign assets and hence save.
Therefore, saving increases with a terms-of-trade deterioration, in
contrast to what Harberger and Laursen and Metzler postulated.

Subsequent work by Sachs (1981) and Svensson and Razin (1983)
emphasized the distinction between temporary and permanent
terms-of-trade deteriorations. A temporary terms-of-trade deterioration
implies a temporary fall in income and, by intertemporal
consumption smoothing, consumption falls by less, which deteriorates the
current account. A permanent terms-of-trade deterioration decreases
both income and consumption to about the same extent and hence
has an ambiguous effect on saving. These effects are income effects of
terms-of-trade changes. Svensson and Razin (1983) also include
consumption substitution effects and show, by assuming suitable
separability of the preferences, how the static and intertemporal substitution
effects can be organized in terms of changes in consumer price
indices and real interest rates. They in addition clarify the role of
assumptions about the rate of time preferences.

As noted by Obstfeld (1982b), the 2-period analysis in Sachs (1981)
and Svensson and Razin (1983) does not give much room for dynam¬ 
ics of the current account, since a deficit in period 1 corresponds to a 
surplus in period 2, and vice versa. On the other hand, the infinite- 
horizon analysis in Obstfeld (1982a) and in Svensson and Razin 
(1983) requires that the rate of time preference increase in wealth for 
stability of the steady state, a restriction that is arbitrary and even 
counterintuitive. Also, the assumption of infinitely lived consumers 
gives rise to a very high degree of consumption smoothing and inter¬ 
temporal substitution. 

One could thus argue that a model with finite planning horizons
seems to give rise to a more intuitively reasonable and even more
realistic saving behavior. One such model is the well-known overlapping
generations model (without private intergenerational gifts).
Furthermore, it has nice stability properties, without arbitrary restrictions
on the rate of time preference. In this paper we shall therefore use an
overlapping generations model to represent saving behavior. Previous
authors who have studied open economies by using overlapping
generations models include Kareken and Wallace (1977), Fried
(1980), Buiter (1981), Persson (1983), and Dornbusch (1984). Our
overlapping generations model is standard, except that we include
two consumer goods and, as in Svensson and Razin (1983), assume
conveniently separable preferences so as to be able to use consumer
price indices and real interest rates.

So much for saving. What about investment? Recently, the role of
forward-looking investment behavior in current account determination
has been emphasized by Razin (1980), Marion and Svensson
(1981), Sachs (1981), Bruno (1982), Svensson (1982), and Helpman
and Razin (1984). These papers, with the exception of Helpman and
Razin (1984), all deal with 2-period models, however. Consequently
only one investment decision is being made, which, by definition,
excludes any investment dynamics. In this paper we will be able to
study the dynamic adjustment of investment to changes in the terms
of trade by using the usual straightforward representation of investment
in the overlapping generations model due to Diamond (1965).

Our general findings below are that in order to understand the
current account behavior over time, it is useful to look at the induced
changes in the intertemporal prices, as represented by the various
real interest rates, rather than directly at the static terms-of-trade
changes, since saving and investment may be more unambiguously
related to changes in the intertemporal prices. In analogy with the
work mentioned above, we find that temporary and permanent
terms-of-trade and interest changes have different effects. So do
anticipated and unanticipated terms-of-trade changes. In particular, we
show that cyclical adjustments in the current account are likely to
occur.

The paper is in eight sections. The model is set up in Section II.
Sections III and IV deal with anticipated permanent and temporary
terms-of-trade deteriorations. Section V treats unanticipated
terms-of-trade deteriorations, and Section VI changes in the world rate of
interest. Section VII discusses the Harberger-Laursen-Metzler effect
in some detail. Section VIII presents a summary, qualifications, and
some conclusions.


II. The Model 

We consider a small open economy that faces perfect world markets
for goods and financial assets. There are two traded goods, home
goods and foreign goods, indexed h and f. The economy produces
and exports home goods and imports foreign goods. The foreign
good is numeraire. Hence, in each time period t, the economy can
trade freely at a given price of home goods in terms of foreign goods.
Similarly, the economy can borrow and lend on the international
credit market at a given interest rate. The interest rate in terms of the
imported good, between periods t and t + 1, is denoted by $r^t$.

We first describe the production side of the economy. We assume
that only home goods are produced at home. Production is carried
out by two factors, labor and (nondepreciating) capital, according to a
well-behaved neoclassical production function. There are constant
returns to scale and the labor supply will be fixed at unity, so production
in period t can be expressed as $y^t = f(k^t)$, where $k^t$ is the
economy's capital stock. Only home goods are used as capital goods. Little is
changed if instead only foreign goods are used as capital. (In the
concluding section we discuss the implications of having a more general
two-sector production structure, which complicates the analysis
considerably.)

Capital goods to be used in production in period t + 1 must be
purchased and installed in period t. With these assumptions,
profit-maximizing behavior implies that the capital stock in period t + 1 is
given by

$$f_k(k^{t+1}) = r_h^t. \tag{1}$$

That is, what can be thought of as gross investment, namely the
amount of physical capital held from period t to t + 1, is carried to
the point where the marginal productivity of capital $f_k$ in period t + 1
equals $r_h^t$, the home goods' own rate of interest between periods t and
t + 1. The latter is defined by

$$1 + r_h^t = (1 + r^t)\frac{p^t}{p^{t+1}} \tag{2}$$

and is thus just a way of expressing the intertemporal relative price of
home goods.$^1$ Equation (1) defines $k^{t+1}$ as a decreasing function

$$k^{t+1} = k(r_h^t), \qquad k_r < 0, \tag{3}$$

of the home-goods rate of interest. Let us denote the value of the
capital stock held at the end of period t by $K^t$. Then, we can write $K^t$ as
a function of the terms of trade and the home-goods interest rate,
namely

$$K^t = p^t k^{t+1} = p^t k(r_h^t) = K(p^t, r_h^t), \tag{4}$$

with the derivatives $K_p > 0$ and $K_r < 0$.

From (4) and the definition of the elasticity of substitution in
production $\sigma'$, the partial elasticities of this function are

$$\frac{\varepsilon K}{\varepsilon p} = 1 \tag{5}$$

and

$$\frac{\varepsilon K}{\varepsilon r_h} = -\frac{\sigma'}{\theta}, \tag{6}$$

where $\theta$ is labor's distributive share. In general, $\sigma'$ and $\theta$ vary with the
capital intensity of production. The precise values of these parameters
are unimportant for our results, but for specificity in the following,
we assume that over the range of terms of trade and interest rates
with which we deal,

$$\sigma' \ge \theta, \tag{7}$$

which makes $\varepsilon K/\varepsilon r_h \le -1$.
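As a rough numerical illustration of the capital-value function, the sketch below assumes a Cobb-Douglas technology $f(k) = k^a$, which is not a form used in the text. Under that assumption the elasticity of substitution in production is 1 and labor's share is $\theta = 1 - a$, so the interest elasticity of $K$ should equal $-1/(1 - a)$ and the price elasticity should equal 1. The function names and parameter values are hypothetical.

```python
import math

def capital_stock(r_h, a=0.25):
    """k(r_h) solving f_k(k) = r_h for the assumed form f(k) = k**a."""
    return (a / r_h) ** (1.0 / (1.0 - a))

def capital_value(p, r_h, a=0.25):
    """Value of the capital stock, K(p, r_h) = p * k(r_h)."""
    return p * capital_stock(r_h, a)

def elasticity(func, x, h=1e-6):
    """Numerical elasticity d ln func / d ln x by central differences."""
    num = math.log(func(x * (1 + h))) - math.log(func(x * (1 - h)))
    den = math.log(1 + h) - math.log(1 - h)
    return num / den

a, p, r_h = 0.25, 1.0, 1.0   # illustrative values only
theta = 1.0 - a              # labor's share under Cobb-Douglas
sigma_prod = 1.0             # elasticity of substitution under Cobb-Douglas

eps_K_p = elasticity(lambda x: capital_value(x, r_h, a), p)
eps_K_r = elasticity(lambda x: capital_value(p, x, a), r_h)
print(eps_K_p, eps_K_r, -sigma_prod / theta)
```

Because $\sigma' = 1 \ge \theta$ here, the assumed technology falls in the range where the value of the capital stock is at least unit elastic in the home-goods rate of interest.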

Wages in period t are given as

$$W^t = p^t[f(k^t) - k^t f_k(k^t)] = p^t w(k^t) = W(p^t, k^t), \tag{8}$$

an increasing function of the terms of trade and the capital stock. We
note for future reference that $w(k^t)$ is the product real wage in period t.
Having thus described the production side of the economy, let us


1 We note that (1) and (2) can be written $[p^{t+1}f_k(k^{t+1}) + (p^{t+1} - p^t)]/p^t = r^t$, which
is the parity condition that the return in period t + 1 to investing in capital goods
in period t, measured in foreign goods and including capital gains, equals the return to
investing in the foreign asset, the rate of interest $r^t$.




turn to consumers. Since growth is not essential to our story, we
assume a stationary population. All consumers live for only 2 periods.
In each period, a young and an old generation thus overlap (we set
the number of consumers of each generation equal to unity for
simplicity). Both goods are consumed, and we denote consumption of
young consumers in period t of the two goods by $c_h^t$ and $c_f^t$, while
consumption of old consumers in period t is $d_h^t$ and $d_f^t$.

Young consumers have a fixed endowment of one unit of labor that
is inelastically supplied. Part of the wages received is consumed, part
is saved. Old consumers do not work but consume principal and
interest from their savings. There are neither bequests nor gifts given
to old people, so consumers start and end their lives with zero
endowments.

Consumers are identical in all respects, and their preferences over
consumption as young and old can be represented by a well-behaved
utility function of the special form $V(c^t, d^{t+1})$, where $c^t = U(c_h^t, c_f^t)$ and
$d^{t+1} = U(d_h^{t+1}, d_f^{t+1})$. We assume that preferences are homothetically
separable over time. Then we may, without loss of generality, choose
the subutility function $U(c_h, c_f)$ linearly homogeneous. Finally, we
assume that the subutility function is the same in both periods. These
assumptions make it possible to define exact consumer price indices,
which will be convenient for our analysis. Also, the scalars $c^t$ and $d^{t+1}$
can be interpreted as measures of real consumption in the 2 periods.

In determining their consumption when young, consumers maximize
the utility function subject to the life-cycle budget constraint
that the present value of their total consumption is equal to their wage
income (which is their wealth):

$$(p^t c_h^t + c_f^t) + \frac{p^{t+1} d_h^{t+1} + d_f^{t+1}}{1 + r^t} = W^t. \tag{9}$$


Let $C^t = p^t c_h^t + c_f^t$ denote (the value of) consumption of the young in
period t. Under the assumptions about the utility function, this can be
written as a function

$$C^t = C(p^t, \rho^t, W^t) \tag{10}$$

of the terms of trade, of the real rate of interest $\rho^t$, and the wage (see
App.). The real rate of interest relevant for consumption decisions is
defined by

$$1 + \rho^t = (1 + r^t)\frac{P(p^t)}{P(p^{t+1})}, \tag{11}$$

where $P(p^t)$ is the exact consumer price index in period t (given by the
unit-subutility expenditure function corresponding to the subutility
function $U(c_h, c_f)$).$^2$ The difference between wages and consumption
of the young is savings of the young, $S^t$. It can be written as a savings
function

$$S^t = W^t - C(p^t, \rho^t, W^t) = S(p^t, \rho^t, W^t), \tag{12}$$

which is also a function of the terms of trade, the real rate of interest,
and wages. Since the properties of this function are crucial for our
story, we will look at them in some detail. Note first that the terms of
trade in period t + 1 do not enter as a separate argument in the
savings function. This follows from the separability of the overall
utility function $V(\cdot)$. However, $p^{t+1}$ does affect the savings decision
indirectly via its effect on the real interest rate (compare [11]).

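The partial elasticities of the savings function with respect to its
three arguments are (for details, see App.):

$$\frac{\varepsilon S}{\varepsilon p} = \frac{1 - \alpha}{\alpha}\,\beta(\gamma - 1), \tag{13}$$

$$\frac{\varepsilon S}{\varepsilon(1 + \rho)} = \frac{1 - \alpha}{\alpha}\,\eta = \frac{1 - \alpha}{\alpha}(\tilde\eta - \alpha\gamma) = (1 - \alpha)(\sigma - \gamma), \tag{14}$$

and

$$\frac{\varepsilon S}{\varepsilon W} = \frac{1 - (1 - \alpha)\gamma}{\alpha}. \tag{15}$$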
Here, $\alpha = S^t/W^t > 0$ is the savings ratio, $\beta = p^t c_h^t/C^t > 0$ is the share of
home goods in consumption when young and hence also the share
of home goods in the price index, and $\gamma = W^t C_W^t/C^t > 0$ is the wage
(or income, or wealth) elasticity of consumption when young. Looking
at (13), we see that current terms-of-trade changes affect savings via
changing the price index $P(p^t)$. This by itself changes the nominal
value of current consumption. However, the change in $P(p^t)$ also has
a wealth effect on real consumption, which may or may not exceed
the valuation effect according to whether $\gamma$ is above or below unity. In
the following we assume $\gamma = 1$, a unitary wealth elasticity. A
sufficient condition for this is that the overall utility function $V(c^t, d^{t+1})$
is homothetic in real consumption when young and old. With
this assumption, changes in the terms of trade do not change savings,
provided that the real interest rate and wages are held constant;
$\varepsilon S/\varepsilon p = 0$.


2 We see that if the terms of trade change over time, $\rho$ and r will be different. Such a
distinction between the real rate of interest relevant for consumption decisions and the
world rate of interest has recently been emphasized by Dornbusch (1983) and Obstfeld

As for (14), $\eta = \tilde\eta - \alpha\gamma = -(1 + \rho^t)[\partial C^t/\partial(1 + \rho^t)]/C^t \gtrless 0$ is the
negative of the elasticity of consumption when young with respect to
one plus the real interest rate, $\tilde\eta = \alpha\sigma > 0$ is the negative of the same
compensated elasticity, and $\sigma > 0$ is the elasticity of substitution between
real consumption in the 2 periods. Expression (14) thus follows from
the Slutsky equation in elasticity form. It is seen that savings increase
in the real interest rate, that is, $\varepsilon S/\varepsilon(1 + \rho) > 0$, if the intertemporal
substitution effect on consumption when young dominates the wealth
(income) effect. Subsequently, we indeed take this to be our "normal"
case; that is, for the range of prices and interest rates dealt with below,
we assume $\sigma > \gamma$ (in general $\sigma$, as well as $\alpha$, varies with the level of
savings).

The wage elasticity of savings expressed in (15) is positive as long as
goods are normal. When $\gamma = 1$, as we have already assumed, this
elasticity is unity.

In summary, we have thus assumed $\sigma > \gamma = 1$, which gives

$$S_p = 0, \qquad S_\rho > 0, \qquad S_W > 0. \tag{16}$$

As with our earlier assumption with regard to capital formation, these
assumptions are mainly for specificity so as to avoid a taxonomic
catalog of comparative static results. For any other set of assumptions,
the reader can easily work out his or her own results with the help of (6) and
(13)-(15).
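The "normal" case can be checked numerically under one convenient assumption, which is not a functional form used in the text: CES preferences over real consumption when young and old, with intertemporal elasticity $\sigma$ and a discount weight `delta` (a hypothetical parameter). Such preferences are homothetic, so $\gamma = 1$, and the consumed share of wages is $1/(1 + \delta^\sigma(1 + \rho)^{\sigma - 1})$. The sketch then confirms that savings are unaffected by the terms of trade, unit elastic in wages, and have elasticity $(1 - \alpha)(\sigma - 1)$ with respect to one plus the real interest rate.

```python
import math

def savings(p, rho, W, sigma=2.0, delta=0.8):
    """S(p, rho, W) for assumed CES preferences over real consumption.

    With homothetic preferences (gamma = 1) the terms of trade p do not
    enter once rho and W are held fixed; p is kept only as an argument.
    """
    b = delta ** sigma
    consumed_share = 1.0 / (1.0 + b * (1.0 + rho) ** (sigma - 1.0))
    return (1.0 - consumed_share) * W

def elasticity(func, x, h=1e-5):
    """Numerical elasticity d ln func / d ln x by central differences."""
    num = math.log(func(x * (1 + h))) - math.log(func(x * (1 - h)))
    den = math.log(1 + h) - math.log(1 - h)
    return num / den

p, rho, W, sigma = 1.0, 1.0, 0.5, 2.0   # illustrative values only
alpha = savings(p, rho, W, sigma) / W   # savings ratio

eps_p = elasticity(lambda x: savings(x, rho, W, sigma), p)
eps_R = elasticity(lambda x: savings(p, x - 1.0, W, sigma), 1.0 + rho)
eps_W = elasticity(lambda x: savings(p, rho, x, sigma), W)
print(eps_p, eps_R, (1.0 - alpha) * (sigma - 1.0), eps_W)
```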

Since there are no bequests, total wealth at the end of period t, $A^t$, is
identically equal to the savings of the young generation, namely

$$A^t = S^t = S(p^t, \rho^t, W^t). \tag{17}$$

The economy's total claims on the rest of the world are the difference
between the total wealth of consumers and the value of the
capital stock. Denoting (the value of) foreign assets at the end of
period t by $F^t$, we thus have

$$F^t = A^t - K^t. \tag{18}$$

It is explicitly assumed that foreigners do not hold claims on the home
capital stock (equities). Hence, the net and gross holdings of foreign
financial assets (debt) coincide.

We may now readily define the current account surplus $B^t$ in period
t as the increase in foreign assets from period t - 1 to t, that is,

$$B^t = F^t - F^{t-1}, \tag{19}$$

which, of course, can be rewritten as $B^t = (A^t - A^{t-1}) - (K^t - K^{t-1})$,
that is, saving minus investment. Savings by the young are positive
and given by $A^t = S^t = W^t - P(p^t)c^t$. The old sell all their assets to
maintain a consumption higher than their income from domestic and
foreign assets. Hence their savings $-A^{t-1} = r^{t-1}(K^{t-1} + F^{t-1}) - P(p^t)d^t$
are negative. Using these expressions for $A^t$ and $A^{t-1}$, constant
returns to scale, and the parity condition in footnote 1, it is
straightforward to rewrite the current account surplus as

$$B^t = p^t y^t + r^{t-1}F^{t-1} - P(p^t)(c^t + d^t) - p^t(k^{t+1} - k^t),$$

that is, GNP minus the sum of consumption and investment.
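The intermediate steps behind this rewriting can be sketched as follows. The sketch assumes that the income of the old in period t is the interest $r^{t-1}$ earned on their whole portfolio $K^{t-1} + F^{t-1}$, with the return on capital (including capital gains) equated to $r^{t-1}$ by the parity condition; the dating of the interest rate is part of that assumption.

```latex
\begin{align*}
B^t &= \underbrace{W^t - P(p^t)c^t}_{\text{saving of the young}}
     + \underbrace{r^{t-1}(K^{t-1} + F^{t-1}) - P(p^t)d^t}_{\text{saving of the old}}
     - \underbrace{(p^t k^{t+1} - p^{t-1}k^t)}_{K^t - K^{t-1}}. \\
\intertext{Constant returns give $W^t = p^t[f(k^t) - k^t f_k(k^t)]$, and the parity
condition dated $t-1$ gives $r^{t-1}p^{t-1}k^t = [p^t f_k(k^t) + p^t - p^{t-1}]k^t$, so that}
W^t + r^{t-1}K^{t-1} &= p^t f(k^t) + (p^t - p^{t-1})k^t. \\
\intertext{Substituting and cancelling the terms in $p^{t-1}k^t$,}
B^t &= p^t y^t + r^{t-1}F^{t-1} - P(p^t)(c^t + d^t) - p^t(k^{t+1} - k^t).
\end{align*}
```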

If the internationally given interest rate and terms of trade are
constant, the economy will converge to a stationary state where all
variables are constant.$^3$ It is not necessary, but convenient, to assume
that before the changes in the terms of trade that we shall discuss, the
economy is in a stationary state. In such a stationary state we have, for
all periods t,

$$\begin{aligned}
p^t &= p, \\
r^t &= r_h^t = \rho^t = r, \\
k^t &= k = k(r), \\
K^t &= K = pk(r), \\
W^t &= W = W(p, r), \\
A^t &= A = S(p, \rho, W), \\
F^t &= F = A - K, \\
B^t &= B = 0.
\end{aligned} \tag{20}$$

It is worth observing that although the current account is zero in a
stationary state, the trade balance is not, as long as F is nonzero. If the
economy has positive foreign assets, say, it has interest income from
abroad (GNP is higher than GDP) that allows a higher consumption
than production, that is, a deficit in the trade account.

III. An Anticipated Permanent Terms-of-Trade 
Deterioration 

Let us suppose that the terms of trade change permanently in period 1,
when the economy previously has been in a stationary state. To fix
ideas, we consider a terms-of-trade deterioration. The foreign-goods
interest rate stays constant, however. In a more complete treatment
where the terms-of-trade change is seen as an endogenous response
to some worldwide disturbance, that same disturbance would be likely
to affect also the world interest rate. Our treatment of terms-of-trade
changes in isolation is thus not motivated by realism, but we hope it
may help to clarify the different channels through which the current
account adjusts. In Section VI we consider the effects of isolated
world interest changes, so the effects of any combined terms-of-trade
and interest rate change can be found as the appropriate linear
combination of our results.


3 With a given world interest rate r and given world market price p, the economy is
always stable. (The given world market prices and interest rate fix the capital stock,
which gives wages, and hence they also fix savings.)

The adjustment over time of all the variables of interest is
compactly summarized in figure 1. Here we have relied on the
assumptions about parameters regarding technology and preferences that
were made in the previous section.

In this section we treat an anticipated terms-of-trade deterioration;
that is, all agents have perfect foresight about its occurrence. Then, as
illustrated in the figure, the home-goods and real rates of interest
increase in period 0, the former by more since the price of home
goods in period 1 falls more than the price index.$^4$ For all other
periods the two interest rates do not change.

Because of the increase in the interest rate the capital stock used in
period 1 is lowered, relative to the stationary state. Therefore the
value of capital at the end of period 0 is lower. From the end of period
1 onward the physical capital stock is again at its unchanged stationary
level, but its value is lower. As illustrated in the diagram, the drop in
$K^0$ exceeds that in $K^1$, though.$^5$

What about wages? They go down in period 1, both because home- 
goods prices are lower and because the capital stock is lower. From 
period 2 on they are lower relative to the previous stationary state, 
because prices are lower, although higher than in period 1, since the 
product real wage is back to its unchanged long-run level. 

Knowing the above, we know how savings and hence how total 
wealth respond. Savings go up in period 0 because of the rise in the 
rate of interest. In period 1 savings go down because wages are lower. 
The terms of trade are lower too, but this by assumption has no direct 
effect. Savings are lower in period 2 and onward, but they decrease 
less than in period 1, since wages are then higher than in period 1. 

Given all this, it is clear how foreign assets develop, and from the
changes in foreign assets it is easy to read off the development of the
current account over time. As can be seen from the figure, the current
account undergoes a 3-period oscillating pattern of
surplus-deficit-surplus.

In drawing figure 1, we have assumed that the country initially has
positive foreign assets. In the new stationary state this is still the case
but the foreign assets are smaller. Indeed, as the reader can verify
from (5), (8), and (15), under our assumption that preferences are
homothetic ($\gamma = 1$) the stationary level of foreign assets is linear in the
terms of trade.

4 We have, from (2) and (11), $\widehat{(1 + r_h^0)} = -\hat p^1 > 0$ and $\widehat{(1 + \rho^0)} = -\beta\hat p^1$, where a hat denotes a proportional change.

5 We have $\hat K^1 = \hat p < 0$ and $\hat K^0 = (\varepsilon K/\varepsilon r_h)\hat r_h^0 = [-(\sigma'/\theta)(1 + r_h)/r_h](-\hat p) = (\sigma'/\theta)[(1 + r_h)/r_h]\hat p < 0$. Clearly $\hat K^0 < \hat K^1 < 0$ if $\sigma' \ge \theta$, as we have assumed in (7).
IV. An Anticipated Temporary Terms-of-Trade
Deterioration

Think now instead of an anticipated temporary terms-of-trade
deterioration that occurs in period 1. The economy's adjustment to this
shock is portrayed in figure 2. As for the permanent terms-of-trade
deterioration, the home-goods and real rates of interest go up in
period 0. But, in period 1, when the terms of trade are temporarily
low, future goods are relatively more expensive, which means that
these interest rates go down relative to their stationary state level.

Because of the response of optimal capital intensity to the changes
in the home-goods rate of interest that we illustrate in the figure, the
physical capital stock is lower at the end of period 0 and higher at the
end of period 1. Its value follows the same pattern, since the valuation
effect of the lower prices of home goods in period 1 is dominated by
the volume effect.$^6$

Wages develop in the same way as capital intensity in production,
going down in period 1 and up in period 2. As in the previous
experiment, the negative effect of a lower product wage in period 1 is
further reinforced by the lower product prices.

Savings by the young generation and thereby total wealth adjust as 
follows: upward in period 0 because the real interest rate is higher; 
downward in period 1 because wages and the interest rate are both 
lower; and upward in period 2 because wages are higher. 

Putting the pieces together, we get the development of foreign asset
holdings and the current account in the figure. In the new stationary
state, foreign assets have, of course, returned to their previous
long-run level, since the change in the terms of trade is only a temporary
one. The adjustment toward the new stationary state stretches out
over 4 periods, during which the current account undergoes a
sequence of surplus-deficit-surplus-deficit.

At this point it may be useful to pause and try to understand what
lies behind this quite asymmetric adjustment. As a starting point we
may take the cycle in home-goods and real interest rates, illustrated in
figure 2, that the temporary drop in the terms of trade gives rise to.
The changes in interest rates by themselves drive gross saving (total
wealth) and investment (the value of the capital stock) in different
directions with immediate consequences for foreign assets and the
current account. But also, the slump and the following boom in
investment affect capital intensity and hence wages 1 period later. Since
saving and wages are positively related, the investment cycle thus
leads to a similar cycle in saving, but with a 1-period lag. This too
contributes to the cycles in the current account.


6 This follows, since $\hat K^1 = (\varepsilon K/\varepsilon p)\hat p^1 + (\varepsilon K/\varepsilon r_h)\hat r_h^1 = \hat p^1 - (\sigma'/\theta)[(1 + r_h^1)/r_h^1]\hat p^1 \ge 0$, as
long as $\sigma' \ge \theta$.

These features of the adjustment are not limited to the present
experiment; the same intuition explains the results in the previous
and subsequent sections.


V. Unanticipated Deteriorations in the Terms 
of Trade 

Suppose now that there is a permanent terms-of-trade deterioration
in period 1 that, unlike the other cases we have considered, is
unanticipated. That is, it occurs as a surprise in period 1. The consequences for
the economy of such an unanticipated change in the terms of trade
are illustrated in figure 3. The important difference from the case with
an anticipated terms-of-trade deterioration is that here the
intertemporal relative prices relevant for decision making, that is, the
(expected) home-goods and real rates of interest, do not change in
period 0. This means that the whole path of interest rates is not affected
at all. Then, of course, the physical capital stock stays constant, and
the change in its value is limited to the effect of the fall in home-goods
prices from period 1 onward.

Similarly, the effect on wages is just that of the fall in prices, and
this change in wages is indeed the only channel whereby saving is
affected.

The fact that saving and investment do not react in advance to take
advantage of the forthcoming price change thus leads to the smooth
adjustment illustrated in the figure. The economy settles directly on
its new stationary state path with a (proportionally) lower level of
foreign assets, which is reached via a one-shot deficit in the current
account.

If the deterioration in the terms of trade is unanticipated but
temporary, we get a different picture, as figure 4 illustrates. Here, the
period 1 home-goods and real interest rates fall, since the terms of
trade are back at their initial level from period 2 onward. We do not
have to go into a detailed explanation of the different pieces of the
adjustment process. The resulting sequence of deficit-surplus-deficit
in the current account is really precisely the same as that for the
expected temporary terms-of-trade change (see fig. 2), with the
important difference that the adjustment in period 0, the period before
the terms-of-trade change actually occurs, is absent in the present
case.
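The four experiments of Sections III-V can be reproduced in a small simulation. This is only a sketch under assumed functional forms that are not taken from the text: Cobb-Douglas production $f(k) = k^a$, a Cobb-Douglas subutility so that $P(p) = p^\beta$, and CES intertemporal preferences with $\sigma = 2$ (hence $\sigma > \gamma = 1$). The names `simulate` and `signs`, the argument `p_expected`, and all parameter values are hypothetical. List position 0 is period -1, so the returned entries refer to periods 0, 1, 2, and so on.

```python
def simulate(p, p_expected=None, r=1.0, a=0.25, beta=0.5, sigma=2.0, delta=0.8):
    """Current account B^t for list positions 1, ..., len(p) - 1.

    p[t] is the realized terms of trade; p_expected[t] is what agents in
    position t expect p to be one period ahead (default: perfect foresight,
    with p constant after the end of the path). The economy is assumed to
    start in a stationary state. Paths must keep r_h positive.
    """
    p = list(p)
    if p_expected is None:
        p_expected = p[1:] + [p[-1]]
    b = delta ** sigma
    k = (a / r) ** (1.0 / (1.0 - a))              # initial stationary capital
    F = []
    for t in range(len(p)):
        ratio = p[t] / p_expected[t]
        r_h = (1.0 + r) * ratio - 1.0             # home-goods own rate of interest
        R = (1.0 + r) * ratio ** beta             # one plus the real rate rho
        W = p[t] * (1.0 - a) * k ** a             # wages with the inherited capital
        alpha = b * R ** (sigma - 1.0) / (1.0 + b * R ** (sigma - 1.0))
        k_next = (a / r_h) ** (1.0 / (1.0 - a))   # capital installed for next period
        F.append(alpha * W - p[t] * k_next)       # foreign assets F = A - K
        k = k_next
    return [F[t] - F[t - 1] for t in range(1, len(p))]

def signs(B, tol=1e-12):
    """Label each current account entry as surplus, deficit, or balance."""
    return ["surplus" if x > tol else "deficit" if x < -tol else "balance"
            for x in B]

perm = [1.0, 1.0, 0.9, 0.9, 0.9, 0.9]   # permanent fall from period 1
temp = [1.0, 1.0, 0.9, 1.0, 1.0, 1.0]   # fall in period 1 only
flat = [1.0] * 6                         # agents always expect p = 1 next period

experiments = {
    "anticipated permanent": signs(simulate(perm)),
    "anticipated temporary": signs(simulate(temp)),
    # A surprise permanent change: agents always expect today's price to persist.
    "unanticipated permanent": signs(simulate(perm, p_expected=perm)),
    "unanticipated temporary": signs(simulate(temp, p_expected=flat)),
}
for name, pattern in experiments.items():
    print(name, pattern)
```

With these assumed parameters the initial stationary state has positive foreign assets, which is the case drawn in the figures, and the sign sequences match the patterns described in the text: surplus-deficit-surplus, surplus-deficit-surplus-deficit, a one-shot deficit, and deficit-surplus-deficit.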
VI. Changes in the World Rate of Interest

A fall in the world rate of interest is also of interest when discussing
terms-of-trade changes in the present context. One reason is that
every young person is a net lender. Hence from each individual's point
of view a fall in the interest rate is a deterioration in the intertemporal
terms of trade. Whether a fall in the world market rate of interest is a
deterioration in the intertemporal terms of trade for the country as a
whole depends on its net position vis-à-vis the rest of the world.
Indeed, it is a deterioration if the country is a creditor (F is positive) but
an improvement if it is a debtor (F negative).

Consider first the adjustment to a permanent fall in the world rate 
of interest occurring in period 1, which is summarized in figure 5. 

With no changes in relative goods prices the home-goods and real
interest rates, of course, change in step with the world interest rate. This means
that the capital stock used in production increases from period 2
onward, and thus that the value of the capital stock is higher from the
end of period 1. Also, wages increase from period 2.



[Fig. 5. Adjustment to a permanent fall in the world rate of interest, periods -1 through 4.]


Saving and total wealth go down in period 1 as a result of the fall in
the real interest rate. Although we have drawn an increase in saving
from period 2 onward, saving may actually fall, depending on the
relative force of the (negative) intertemporal substitution effect,
compared to the (positive) effect of increased wages.

With our earlier assumptions about parameters, and if labor's
factor share is not very low, it follows that the value of the capital stock
must increase proportionally by more than total wealth in the long
run.$^7$ Then, if the share of the capital stock in total wealth is not too
small, foreign assets must fall, resulting in the 2-period adjustment
with a deficit followed by a surplus in the current account.

What about the case of a temporary drop in the world rate of
interest? Here we may rely directly on our results regarding a temporary
unanticipated terms-of-trade deterioration. Noting that in that case
too there is a 1-period fall in the home-goods and real rates of
interest, we realize that the results must be qualitatively (although not
quantitatively) identical in the two cases (the quantitative differences
deriving from the fact that here the two interest rates change in the
same proportion).


VII. The Harberger-Laursen-Metzler Effect 

Let us finally see what all this has to do with the
Harberger-Laursen-Metzler effect. Harberger (1950), in discussing the effect on the
trade balance of a devaluation, and Laursen and Metzler (1950), in
discussing the transmission of disturbances in a two-country world
with endogenous terms of trade and balanced trade (which they
identified with a flexible exchange rate regime), looked at the effect
on spending, measured in home goods, of a terms-of-trade change.
They postulated that expenditure out of any given income, measured
in home goods, should rise with a terms-of-trade deterioration, and
hence that saving out of any given income should fall. Let us examine
this proposition in the context of our model. We refer to Svensson
and Razin (1983) for a discussion of the controversies around the
Harberger-Laursen-Metzler effect and for references to the
literature.

We consider the effect on saving in period t, measured in home
goods, of a terms-of-trade deterioration in periods t and t + 1,
holding the foreign-goods rate of interest constant, which implies that
wages (measured in home goods) are constant too. (This corresponds
most closely to the case with a given domestic income measured in
home goods.) Denoting saving measured in home goods by $S_h^t$, we
hence differentiate

$$S_h^t = \frac{S(p^t, \rho^t, W^t)}{p^t}. \tag{21}$$


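7 We have, if $\gamma = 1$, $\hat S = (1 - \alpha)(\sigma - 1)\widehat{(1 + r)} + \hat W$, and $\hat W = (\varepsilon w/\varepsilon r_h)[(1 + r)/r]\widehat{(1 + r)} = -[(1 - \theta)/\theta][(1 + r)/r]\widehat{(1 + r)}$, where we have used $\varepsilon w/\varepsilon r_h = -(1 - \theta)/\theta$. Hence, $\hat S = \{(1 - \alpha)(\sigma - 1) - [(1 - \theta)/\theta][(1 + r)/r]\}\widehat{(1 + r)}$. Furthermore, $\hat K = (\varepsilon K/\varepsilon r_h)[(1 + r)/r]\widehat{(1 + r)} = -(\sigma'/\theta)[(1 + r)/r]\widehat{(1 + r)}$. Thus $\hat A - \hat K = ((1 - \alpha)(\sigma - 1) + \{[\sigma' - (1 - \theta)]/\theta\}[(1 + r)/r])\widehat{(1 + r)} < 0$ if $\sigma \ge 1$ and $\sigma' \ge \theta$, provided $\theta > 1/2$.

Now, with the same notation as in Section II, $\widehat{(1 + \rho^t)} = \beta^t\hat p^t - \beta^{t+1}\hat p^{t+1}$
and $\hat W^t = \hat p^t$. Then, relying on our earlier results (13)-(15),
we may derive

$$\begin{aligned}
\hat S_h^t &= \hat S^t - \hat p^t \\
&= (1 - \alpha)(\sigma - \gamma)(\beta^t\hat p^t - \beta^{t+1}\hat p^{t+1}) \\
&\quad + \frac{1 - \alpha}{\alpha}(1 - \beta^t)(1 - \gamma)\hat p^t,
\end{aligned} \tag{22}$$

where $0 < \beta^t = p^t c_h^t/C^t\ (= p^t d_h^t/[p^t d_h^t + d_f^t]) < 1$ is the share of home
goods in consumption in period t.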

Let us first consider a permanent terms-of-trade deterioration, that
is, $\hat p^t = \hat p^{t+1} = \hat p < 0$, and assume that the share of home goods in
consumption is the same in both periods, that is, $\beta^t = \beta^{t+1} = \beta$. Then
the first term on the right-hand side in (22) is zero and

$$\hat S_h^t = \frac{1 - \alpha}{\alpha}(1 - \beta)(1 - \gamma)\hat p \lessgtr 0 \quad \text{for } \gamma \lessgtr 1. \tag{23}$$

Saving falls if $\gamma < 1$, that is, if period t consumption is less than unitary
elastic in wages (wealth), which corresponds to the case of a rate of
time preference that is decreasing in wealth. Saving is unaffected if $\gamma = 1$,
which corresponds to intertemporally homothetic preferences
and a constant rate of time preference. Saving increases if $\gamma > 1$,
which corresponds to a rate of time preference that is increasing in
wealth.

Let us also consider a temporary terms-of-trade deterioration, that
is, $\hat p^t = \hat p < 0$ and $\hat p^{t+1} = 0$. Then we have

$$\hat S_h^t = \frac{1 - \alpha}{\alpha}[\beta\alpha(\sigma - \gamma) + (1 - \beta)(1 - \gamma)]\hat p. \tag{24}$$

Saving falls if the bracketed term, the weighted sum of the
intertemporal price elasticity of consumption, $\eta = \alpha(\sigma - \gamma)$, and of one minus
the wealth elasticity of consumption, $1 - \gamma$, is positive. That happens,
for example, if preferences are homothetic ($\gamma = 1$) and the
substitution effect dominates over the wealth effect ($\sigma > \gamma = 1$). We note that
the Cobb-Douglas case, which has $\sigma = \gamma = 1$, gives a zero effect on
saving measured in home goods for both temporary and permanent
terms-of-trade deteriorations.

We conclude that the Harberger-Laursen-Metzler effect indeed 
can be of either sign for plausible parameters, both for temporary 
and permanent terms-of-trade deteriorations. 
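The sign conditions can be tabulated with a small helper. The sketch assumes that the proportional change in saving measured in home goods is $[(1 - \alpha)/\alpha](1 - \beta)(1 - \gamma)\hat p$ for a permanent deterioration and $[(1 - \alpha)/\alpha][\beta\alpha(\sigma - \gamma) + (1 - \beta)(1 - \gamma)]\hat p$ for a temporary one, with the share of home goods in consumption the same in both periods. The function name `hlm_effect` and the parameter values are hypothetical.

```python
def hlm_effect(p_hat, alpha, beta, sigma, gamma, permanent):
    """Proportional change in saving measured in home goods.

    p_hat < 0 is a terms-of-trade deterioration; alpha is the savings
    ratio, beta the home-goods consumption share, sigma the intertemporal
    elasticity of substitution, gamma the wealth elasticity of consumption.
    """
    wealth_term = (1.0 - beta) * (1.0 - gamma)
    if permanent:
        bracket = wealth_term
    else:
        bracket = beta * alpha * (sigma - gamma) + wealth_term
    return (1.0 - alpha) / alpha * bracket * p_hat

p_hat, alpha, beta = -0.1, 0.5, 0.5   # illustrative values only

cobb_douglas = [hlm_effect(p_hat, alpha, beta, 1.0, 1.0, perm) for perm in (True, False)]
normal_temporary = hlm_effect(p_hat, alpha, beta, 2.0, 1.0, permanent=False)
low_gamma_permanent = hlm_effect(p_hat, alpha, beta, 2.0, 0.5, permanent=True)
high_gamma_permanent = hlm_effect(p_hat, alpha, beta, 2.0, 1.5, permanent=True)
print(cobb_douglas, normal_temporary, low_gamma_permanent, high_gamma_permanent)
```

The sign flips with $\gamma$ in the permanent case and with the bracketed term in the temporary case, which is the sense in which the effect can go either way.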
VIII. Discussion

In discussing the dynamic adjustment of the current account to
terms-of-trade changes, we have included the substitution effects
within each period ("expenditure switching"), which usually receive a
great deal of attention in static treatments. We have, however, also
included the effects of static terms-of-trade changes on intertemporal
relative prices, as measured by various real interest rates, and how
these changes in interest rates might influence savings and
investment. Indeed, we have found it very convenient to emphasize those
induced changes in intertemporal prices in order to understand the
current account dynamics.

Our results suggest that it is crucial to make a distinction not only
between temporary and permanent but also between anticipated and
unanticipated changes in the terms of trade. From this viewpoint, it
seems that attempting to derive unqualified statements about the
dynamic adjustment of the current account to terms-of-trade changes is
a futile exercise.

However, there is one phenomenon that recurs in virtually all of 
our experiments, namely a cyclical adjustment in the current account. 
What lies behind the swings is, to repeat our earlier argument, that 
terms-of-trade changes induce fluctuations in investment-goods real 
rates of interest, which implies cycles in investment, which with a lag 
lead to cycles in wages and income. Induced cycles in consumer real 
rates of interest lead to cycles in savings, as do the lagged cycles in 
income. Altogether, this gives cycles in the current account. 

Our model is of course extremely simplified in that the horizons for 
consumption and investment decisions coincide and in that consum¬ 
ers earn wage income only in the first period. Allowing for costs of 
adjustment of the capital stock would lengthen the planning horizon 
for investment, and changes in the optimal capital stock would lead to 
investment changes over several periods. Allowing for consumers to 
live more than 2 periods and earn wage income in several periods 
would also spread out savings adjustment to several periods. Never¬ 
theless, the general tendency toward cycles in total wealth and the 
capital stock, and hence toward cyclical fluctuations in the current 
account, would probably remain, although the specific dynamics of 
the current account might be different. 

As for other qualifications, the assumption of only one production 
sector is a convenient simplification, but it is also important for some 
of our results. Suppose we instead adopted a two-sector production 
structure as in the Heckscher-Ohlin model. That would clearly 
introduce static substitution effects in production in response to 
terms-of-trade changes, and also alter the intertemporal substitution 
effects, making the effects on investment dependent on relative factor 
intensities, as in Razin (1980). Furthermore, such an extension would 
have implications on the consumption side. In particular, the 
response of savings to terms-of-trade changes would also depend on 
relative factor intensities. This is so because these intensities would 
determine the relative changes in wages and capital income, 
associated with very different marginal propensities to save. 


Appendix 

Properties of the Consumption and Saving Function 

Consider maximizing the weakly homothetically separable utility function 

$$v^t = V(c^t, d^{t+1}), \quad \text{where } c^t = U(c^t_h, c^t_f) \text{ and } d^{t+1} = U(d^{t+1}_h, d^{t+1}_f), \tag{A1}$$

and where the subutility function $U(c_h, c_f)$ is linearly homogeneous. We shall 
interpret the scalars $c^t$ and $d^{t+1}$ as real consumption in periods $t$ and $t + 1$ for 
a consumer who is young in period $t$. The budget constraint is 

$$(p^t c^t_h + c^t_f) + \frac{p^{t+1} d^{t+1}_h + d^{t+1}_f}{1 + r^t} = W^t. \tag{A2}$$

It is well known that the solution to maximizing (A1) subject to (A2) can be 
described as follows: First, nominal consumption in both periods, $C^t = p^t c^t_h + c^t_f$ 
and $D^{t+1} = p^{t+1} d^{t+1}_h + d^{t+1}_f$, can be written as the product of a price index 
and the corresponding real consumption, 

$$C^t = P(p^t)c^t, \quad D^{t+1} = P(p^{t+1})d^{t+1}. \tag{A3}$$

Here, the exact consumer price index $P(p) = \min\{p c_h + c_f : U(c_h, c_f) = 1\}$ is 
the unit-subutility expenditure function associated with the subutility 
function $U(c_h, c_f)$. 

Second, the optimal amount of real consumption is the solution to 
maximizing $V(c^t, d^{t+1})$ subject to the budget constraint 

$$P(p^t)c^t + \frac{P(p^{t+1})}{1 + r^t}\, d^{t+1} = W^t. \tag{A4}$$

Defining the real discount factor $q^{t+1} = P(p^{t+1})/[P(p^t)(1 + r^t)]$ and real 
wealth (wages) $w^t = W^t/P(p^t)$, we can write the optimal real consumption in 
period $t$ as a function $c^t = c(q^{t+1}, w^t)$. Defining the real rate of interest $\rho^t$ 
according to $q^{t+1} = 1/(1 + \rho^t)$, we can then define the (nominal) consumption 
function 

$$C^t = C(p^t, \rho^t, W^t) = P(p^t)\, c\!\left(\frac{1}{1 + \rho^t}, \frac{W^t}{P(p^t)}\right). \tag{A5}$$
The Information Content of Specialist Pricing 
This paper examines a process by which information-revealing 
prices are determined by considering the private incentives of a 
price-setting agent (whom we refer to as a specialist). The specialist 
has private information that may be (partially) revealed through his 
choice of a pricing rule. We define an equilibrium as a pricing rule 
and a response to that rule by a representative trader that maximizes 
the expected utilities of the specialist and the trader, conditional on 
each having rational expectations. By analyzing the existence and 
nature of this equilibrium, we attempt to develop further insights 
into the behavior of markets with incomplete information. 


I. Introduction 

The purpose of this paper is to deal simultaneously with two problems 
the literature on finance and economics has tended to address 
separately. On the one hand, the literature has considered the problem 
of determining the information content of prices without detailed 
modeling of the process by which prices are formed. On the other 
hand, the literature has dealt with the process of price determination 
without paying much attention to the mechanism by which any 
information contained in prices satisfies a rational expectations 
equilibrium.¹ By considering these two problems jointly, we hope to obtain 
further insights into how speculative markets work in a world of 
incomplete information. 

Our approach is to analyze a model in which a specialist explicitly 
considers both the information content and the trading implications 
of the price he sets. Specifically, we consider a model of a market in 
which a specialist and a representative trader exchange a riskless 
numeraire good for a risky asset. Before trading occurs, both the 
specialist and trader are endowed with private information and assets. 
After trade occurs, the random return on the risky asset is realized, 
and the trader and specialist consume the liquidation value of the 
final portfolio they own. Trade itself is a two-step process. In step one, 
the specialist sets a price.² In step two, the trader chooses the quantity 
he wishes to buy or sell at that price.³ Note that we restrict the analysis 
to a single-price case (e.g., multipart pricing or bid-ask spread pricing 
are not considered). 

The specialist’s selection of a price to maximize his expected sur¬ 
plus provides a rationale, or process, by which prices are formed. The 
specialist is not allowed to make the price contingent on the quantity 
purchased. Thus, in the second step in the trading process, the trader 
behaves as a pure price taker. In this way, we are able to link private 
incentives to set prices, through the introduction of the specialist, with 
the competitive behavior of traders exhibited in markets in which 
prices are set by a Walrasian auctioneer. 


¹ No attempt is made here to provide a comprehensive listing of all the articles that 
have dealt with these topics. A partial listing of some of the relevant papers includes 
Kihlstrom and Mirman (1975), Garman (1976), Grossman (1976, 1978), Jaffe and 
Winkler (1976), Bradfield and Zabel (1979), Wilson (1979), Zabel (1979), Gould (1980), 
Grossman and Stiglitz (1980), Hellwig (1980, 1982), Verrecchia (1980, 1982), Diamond 
and Verrecchia (1981), Milgrom (1981), Grinblatt and Ross (1982), and Milgrom and 
Weber (1982). 

² In practice, a specialist can play both an active role and a passive role in price 
determination. In his passive role, the specialist maintains a “book” of limit orders and 
fills market orders from this book at the price that is best from the viewpoint of the 
customer placing the market order. In his passive role, the specialist is purely a broker 
who is compensated by brokerage fees. In the active role, the specialist actually takes a 
market position himself. Our analysis considers only the active role of the specialist; we 
do not analyze the nature or behavior of limit orders. Moreover, we do not allow the 
specialist to limit his exposure: once he sets price he must balance the market by 
adjusting his own portfolio to cover any difference between the demand and supply 
quantities offered by traders. We make these abstractions to avoid unnecessary 
complications, not because we think that further extensions of the model would be 
uninteresting. 

³ We assume that the specialist can issue shares in either the risky or riskless asset, if 
necessary, to cover any excess or negative demand. 
If both the specialist and the trader had the same information, our 
problem would be similar to the standard textbook model of 
monopoly.⁴ This is not the case, however, when the specialist has private 
information, because then the trader’s demand function depends on 
what he thinks the specialist uses to set prices. This means that some 
care needs to be taken in defining an equilibrium. 

The price rule the specialist uses in setting a price potentially 
reveals his private information to the trader. The specialist selects a 
price to maximize his expected utility in anticipation of the trader’s 
demand at that price, while carefully weighing the fact that the 
trader’s demand is affected by what he learns from the price 
announcement. The trader, for his part, infers information on the basis of a 
conjecture about the price rule used by the specialist, which influences 
the trader’s demand. Therefore, in our model, we define an 
equilibrium as a price rule and a response to that rule that maximizes the 
respective expected utilities of the specialist and trader conditional on 
each having rational expectations. 

Because price reveals information, we assume that the specialist can 
disguise, or mask, whatever rate he selects for exchanging risky versus 
risk-free assets. The specialist does this by adding a noise term to the 
exchange rate he selects in determining the price he quotes the 
trader.⁵ Although we initially assume that the level of noise is an 
exogenously specified parameter, in Section IV we discuss what par¬ 
ticular level of noise is optimal from the specialist’s perspective. 

A brief outline of the paper is as follows. In Section II we formally 
define an equilibrium to the problem outlined above. As a way of 
illustrating the nature of this equilibrium, in Section III we introduce 
specific assumptions that allow closed-form expressions. In Section 
IV, we examine the role of the level of noise in influencing the 
specialist’s surplus and, in Section V, we consider the realism of our model in 
comparison with how a specialist is thought to operate. In the final 
section, we summarize our discussion. 


II. Definition of an Equilibrium 

In this section we define an equilibrium to a market for risky assets 
whose price is set by a maximizing monopolist. We assume that the 
specialist is risk neutral: that is, he has a utility for consuming an 
amount $w$ represented by $S(w) = w$. The numeraire in the market is 
the price of a bond known to return one unit of the consumption 
good in the final period. The liquidating dividend on the risky asset is 
unknown until the final period and is represented by a random 
variable $\tilde u$ whose realization is denoted by $u$. (Henceforth, a tilde will be 
used to distinguish a random variable from its realization.) The trader 
is endowed with $B_t$ riskless assets and $x_t$ risky assets; the specialist’s 
endowment is irrelevant (in this analysis) in the presence of his risk 
neutrality. The specialist and trader are also endowed with private 
information about the uncertain outcome $\tilde u$. Specifically, the specialist 
observes $\tilde y_s$ and the trader observes $\tilde y_t$, where each of $\tilde y_s$ and $\tilde y_t$ 
communicates the actual outcome $\tilde u = u$ perturbed by some noise. 

⁴ One difference is that the quantity demanded may be negative; i.e., the trader 
may choose to sell rather than buy the risky asset. 

⁵ This raises an important, but subtle, modeling issue: if the trader acts as if the price 
garbles the information the specialist has, then the specialist may either not 
garble the price or garble it in a way other than the trader is assuming. To avoid this 
inconsistency, we make one of two analytically equivalent assumptions: (1) there exists a 
precommitment mechanism that garbles the price, such as the static in the telephone 
line the specialist agrees to use in communicating the price to the trader, or (2) the 
parameters of the garbling distribution (the expression $V$ in our model) are revealed to 
the trader along with the noisy price. 

We assume that in quoting a price to the trader, the specialist can 
precommit to garbling whatever value he selects as the appropriate 
rate of exchange for risky versus risk-free assets. Let $P(x_t, y_s)$ represent 
the value the specialist selects as the exchange rate as a function of the 
trader’s endowment of the risky asset (which is common knowledge), 
$x_t$, and the private information observed by the specialist, $y_s$. Let $\tilde\delta$ 
represent the garbling, or noise, that is added to $P(\cdot)$. That is, the 
specialist selects $P(x_t, y_s)$, but the price quoted to the trader is 
$P(x_t, y_s) + \tilde\delta$. Finally, let $R(x_t, B_t, y_t, p)$ represent the trader’s excess demand for 
the risky asset, as a function of his endowment of the risky and 
risk-free assets, $x_t$ and $B_t$, respectively, his private information, $y_t$, and the 
price of the risky asset, $p$. This allows us to define expected utility 
functions $\gamma_s$ and $\gamma_t$, for the specialist and trader, respectively, by: 

$$\gamma_s[x_t, y_s, p \mid R(\cdot)] = E[(p + \tilde\delta - \tilde u)\,R(x_t, B_t, \tilde y_t, p + \tilde\delta) \mid \tilde y_s = y_s],$$

$$\gamma_t[x_t, B_t, y_t, p, r \mid P(\cdot) + \tilde\delta] = E\{U[(\tilde u - p)(r + x_t) + p x_t + B_t] \mid P(x_t, \tilde y_s) + \tilde\delta = p,\ \tilde y_t = y_t\},$$

where $U(w)$ is the trader’s utility for consuming an amount $w$. If, at $p$, 
the expression $R(\cdot)$ is positive, the specialist “covers” the trader by 
issuing shares against his own account of the risky asset sufficient to 
satisfy the excess demand; if the expression is negative, the specialist 
“eats” the excess supply by making it part of his personal portfolio. 

This allows us to define an equilibrium to a market in which the 
price is set by a maximizing monopolist with private information as a 
pair of functions $P(\cdot)$ and $R(\cdot)$ such that 

$$P(x_t, y_s) = \arg\max_p\ \gamma_s[x_t, y_s, p \mid R(\cdot)],$$

$$R(x_t, B_t, y_t, p) = \arg\max_r\ \gamma_t[x_t, B_t, y_t, p, r \mid P(\cdot) + \tilde\delta].$$

Essentially, $R(\cdot)$ is the set of arguments for $r$ that maximizes the 
trader’s expected utility in response to any price $p$ he is quoted, given his 
conjecture about the price rule used by the specialist. $P(\cdot)$ is the set of 
arguments for $p$ that represents the specialist’s best response to the 
trader’s anticipated demand, given the private information the 
specialist observes. In this equilibrium, expectations are rational, or 
fulfilled, because the conditional expectation operators defining $\gamma_s$ 
and $\gamma_t$ are the correct ones based on the underlying joint distribution 
of $\tilde u$, $\tilde y_s$, $\tilde y_t$, and $\tilde\delta$. 


III. An Illustration of an Equilibrium 

To illustrate the nature of an equilibrium to the model we propose, 
we introduce three assumptions that facilitate the analysis. First, we 
assume that the random vector $(\tilde u, \tilde y_s, \tilde y_t, \tilde\delta)$ has a four-variate normal 
distribution with the mean $(y_0, y_0, y_0, 0)$ and covariance matrix 

$$\begin{pmatrix} h^{-1} & h^{-1} & h^{-1} & 0 \\ h^{-1} & h^{-1} + f^{-1} & h^{-1} & 0 \\ h^{-1} & h^{-1} & h^{-1} + g^{-1} & 0 \\ 0 & 0 & 0 & V \end{pmatrix}.$$


The expression $h$, which is the precision of the unconditional 
distribution of $\tilde u$, can be thought of informally as a measure of the amount of 
common knowledge about $\tilde u$. The expressions $f$ and $g$ can be 
interpreted as measures of the amount of private information held by the 
specialist and trader, respectively: for example, $f = 0$ implies that 
the specialist has no information and $f \to \infty$ implies that he knows the 
realization $u$ of $\tilde u$ with certainty. The expression $V$ is the level of noise, 
or garbling, the specialist precommits to including in any price the 
trader is quoted. 

Second, the trader has a (negative) exponential utility function with 
constant absolute risk tolerance of one: that is, he has a utility for 
consuming an amount $w$ represented by $U(w) = -\exp(-w)$. This, in 
conjunction with all the assumptions taken together and especially 
normality, implies that the trader’s excess demand for the risky asset, 
$R(\cdot)$, is given by 

$$R(x_t, y_t, p) = \frac{E[\tilde u \mid \tilde y_t = y_t,\ P(\cdot) + \tilde\delta = p] - p}{\operatorname{var}[\tilde u \mid \tilde y_t = y_t,\ P(\cdot) + \tilde\delta = p]} - x_t.$$

Note that for this case the trader’s excess demand for the risky asset is 
independent of his endowment of the risk-free asset.⁶ 


⁶ The discussion in this section is couched in terms of one trader facing the specialist. 
The results are easily generalized to many traders assuming that the traders' excess 
demand function is linear in price and all traders observe the same information. The 
model could be further generalized such that traders observe different information. 





Finally, we assume that the trader always conjectures that the rule 
used by the specialist in selecting price has the special linear form 
$P(x_t, y_s) = a y_s + b x_t + c$, where $a$, $b$, and $c$ are real-valued constants. 
Essentially, the effect of these three assumptions is to permit an explicit 
solution to the equilibrium concept we propose by preserving a 
certain linearity already inherent in the problem.⁷ In restricting attention 
to linear objective functions, we are not modifying our definition of 
equilibrium, but instead are looking for equilibria of a particular 
kind. We leave as unresolved the question whether nonlinear-type 
assumptions will also yield explicit solutions. 
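
A minimal numerical sketch of the trader's demand under the linear conjecture may help fix ideas. It assumes standard normal updating, in which the quoted price is equivalent to the signal $z = (p - b x_t - c)/a = y_s + \delta/a$ with precision $a^2 f/(a^2 + fV)$, together with the trader's unit risk tolerance. The function name and the numerical values are illustrative assumptions and do not come from the paper.

```python
def trader_excess_demand(y_t, p, x_t, a, b, c, h, f, g, V, y0):
    """Excess demand of a CARA trader (risk tolerance 1) under normality.

    Assumes the trader conjectures that the quoted price is
    p = a*y_s + b*x_t + c + delta, with var(delta) = V.
    """
    if a == 0 or f == 0:
        # The price carries no information about u in this case.
        pi_z, z = 0.0, 0.0
    else:
        z = (p - b * x_t - c) / a           # implied noisy signal y_s + delta/a
        pi_z = a**2 * f / (a**2 + f * V)    # precision of z as a signal of u
    post_prec = h + g + pi_z                # posterior precision of u
    post_mean = (h * y0 + g * y_t + pi_z * z) / post_prec
    # (E[u|.] - p) / var[u|.] is the desired holding; subtract the endowment.
    return (post_mean - p) * post_prec - x_t


# Illustrative values only: h = f = g = 1, V = 0, y0 = 10, x_t = 2,
# with hypothetical coefficients a = 1/2, b = -1/4, c = 5.
demand = trader_excess_demand(y_t=11.0, p=10.5, x_t=2.0,
                              a=0.5, b=-0.25, c=5.0,
                              h=1.0, f=1.0, g=1.0, V=0.0, y0=10.0)
print(demand)  # -0.5
```

The endowment of the risk-free asset does not appear among the arguments, in line with the remark that excess demand is independent of it.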

With regard to illustrating a solution to the special case we consider, 
our first result is to reduce the equilibrium concept to a more tractable 
form. Consider the specialist’s choice of the parameters $a$, $b$, and $c$ in 
determining an expression for $P(x_t, y_s)$. Let $A(a)$ be a quadratic 
function of $a$ of the form $A(a) = a^2(h + f + g) - af + Vf(h + g)$. 

Lemma 1: The existence of an equilibrium is equivalent to the 
existence of an $a^*$ that simultaneously satisfies 

$$[2a(h + f) - f]A(a) - fg(Vf + a^2) = 0 \tag{1}$$

and 

$$A(a) > 0. \tag{2}$$

Proof: See Appendix. 

The intuition behind lemma 1 is that the existence of an 
equilibrium breaks down to two requirements: equation (1) ensures that the 
specialist’s choice of $a^*$ leads to a conjecture that is fulfilled on the part 
of the trader, while equation (2) ensures that the specialist’s choice of 
$a^*$ maximizes his expected utility (i.e., his objective function is globally 
concave with respect to $a$). The additional parameters $b^*$ and $c^*$ can be 
determined by substituting the value for $a^*$ into the functions $b(a)$ and 
$c(a)$ given by: 

$$b(a) = -\frac{a^2 + fV}{2[a^2(h + f + g) + fV(h + g)] - fa}, \qquad c(a) = y_0(1 - a). \tag{3}$$



It is now possible to establish the existence of a unique equilibrium 
when $V > 0$, $f > 0$, and $g > 0$. 

Theorem 1: There exists a unique equilibrium. 

Proof: First, observe that $a^*$ satisfies $[2a(h + f) - f]A(a) - fg(Vf + a^2) = 0$ 
and $A(a) > 0$ if and only if it is a root to the third-order 
polynomial $[2a(h + f) - f]A(a) - fg(Vf + a^2)$ in the region 
$[f/[2(h + f)], \infty)$. Then, observe that this third-order polynomial is 

1. clearly negative at $a = f/[2(h + f)]$; 

2. eventually positive as $a$ becomes large, since the coefficient of $a^3$ is 
positive; and 

3. convex in the region $[f/[2(h + f)], \infty)$. 

These three facts imply that the polynomial has a single (positive, 
real-valued) root in the region defined by $[f/[2(h + f)], \infty)$. Therefore, 
there exists one, and only one, $a^*$ that both is a root to the polynomial 
expression and satisfies $A(a) > 0$. Q.E.D. 

⁷ Specifically, we have already assumed that the specialist is risk neutral and that 
garbling requires adding a noise term to the specialist’s selection of price. 
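
The root described in the proof can be located numerically. The sketch below, which is not the authors' procedure, builds the cubic from equation (1), keeps the real root in $[f/[2(h + f)], \infty)$ at which $A(a) > 0$, and then evaluates $b(a)$ and $c(a)$. It assumes $f > 0$ and takes $b(a) = -(a^2 + fV)/\{2[a^2(h + f + g) + fV(h + g)] - fa\}$, a form inferred here from the linear-normal setup and the polar cases. The function name and tolerances are illustrative.

```python
import numpy as np


def solve_equilibrium(h, f, g, V, y0=0.0):
    """Return (a*, b*, c*) for f > 0, or None if no admissible root exists."""
    lin = np.poly1d([2.0 * (h + f), -f])                 # 2a(h+f) - f
    A = np.poly1d([h + f + g, -f, V * f * (h + g)])      # A(a)
    # Cubic of equation (1): [2a(h+f) - f] A(a) - f g (a^2 + V f)
    cubic = lin * A - np.poly1d([f * g, 0.0, f * g * V * f])
    lower = f / (2.0 * (h + f))
    admissible = [r.real for r in cubic.roots
                  if abs(r.imag) < 1e-9
                  and r.real >= lower - 1e-12
                  and A(r.real) > 1e-12]                 # condition (2)
    if not admissible:
        return None
    a = admissible[0]
    b = -(a**2 + f * V) / (2.0 * (a**2 * (h + f + g) + f * V * (h + g)) - f * a)
    c = y0 * (1.0 - a)
    return a, b, c


# Illustrative parameters: with V = 0 the root should equal f / (h + f).
print(solve_equilibrium(h=1.0, f=1.0, g=1.0, V=0.0, y0=10.0))
# A generic case with all three of f, g, V positive.
print(solve_equilibrium(h=1.0, f=2.0, g=0.5, V=0.3))
```

With $g = 0$ the same routine returns None whenever the noise level is too low for $A(a) > 0$ to hold at the candidate root.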

Although the price the specialist quotes cannot be derived in a 
simple closed form, using lemma 1 straightforward expressions can 
be found for the three polar cases of $V = 0$, $g = 0$, or $f = 0$.⁸ The case 
$V = 0$ is equivalent to assuming that the price the specialist selects is 
never masked. 

Corollary 1: When $V = 0$, a (unique) equilibrium takes the form: 

$$a^* = \frac{f}{h + f}, \qquad b^* = -\frac{1}{h + f + 2g}, \qquad c^* = \frac{h}{h + f}\, y_0.$$



Proof: Suppose $V = 0$. This reduces the requirement for an 
equilibrium in lemma 1 to finding an $a^*$ such that 

$$[2a(h + f) - f]A(a) - fga^2 = 0, \tag{4}$$

and 

$$A(a) = a^2(h + f + g) - af > 0. \tag{5}$$

It is a simple exercise to show that a (unique) $a^*$ that satisfies (4) and 
(5) is given by $a^* = f/(h + f)$. This is because (4) reduces to a quadratic 
function with two real-valued roots, only one of which (i.e., $a = f/[h + f]$) 
implies $A(a) > 0$. Q.E.D. 

Corollary 1 implies that in the absence of noise, that is, $V = 0$, the 
specialist has no truly private information, since the trader can make 
unambiguous inferences about $\tilde y_s = y_s$ on the basis of $\tilde P = P$. 
Alternatively, a trader may have no private information whatsoever. This is 
equivalent to assuming $g = 0$. When the specialist has private 
information while the trader does not, an equilibrium does not exist unless 
noise passes beyond some threshold. 


Corollary 2: When $g = 0$, an equilibrium exists only if 
$V > f/[4h(h + f)]$; in that event, it takes the form 

$$a^* = \frac{f}{2(h + f)}, \qquad b^* = -\frac{1}{2h}\left[1 + \frac{f}{4V(h + f)^2}\right], \qquad c^* = \frac{2h + f}{2(h + f)}\, y_0.$$

⁸ The solution can be found by solving the cubic equation implicit in the 
proof of theorem 1. However, this expression is not transparent. 
Proof: Suppose $g = 0$. This reduces the requirement for an 
equilibrium in lemma 1 to finding an $a^*$ such that 

$$[2a(h + f) - f]A(a) = 0, \tag{6}$$

and 

$$A(a) = a^2(h + f) + Vfh - af > 0. \tag{7}$$

Here, a unique $a^*$ that satisfies (6) is given by $a^* = f/[2(h + f)]$. 
However, at that value, $A(a)$ is positive only if $V > f/[4h(h + f)]$. Thus, there 
exists no equilibrium when $0 \le V \le f/[4h(h + f)]$. Q.E.D. 
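
The threshold in corollary 2 can be checked directly. The short sketch below evaluates $A(a)$ from (7), with the noise term entering as $+Vfh$, at the candidate $a^* = f/[2(h + f)]$ and compares its sign with the bound $f/[4h(h + f)]$. The function names and parameter values are illustrative assumptions.

```python
def noise_threshold(h, f):
    """Smallest noise level above which an equilibrium exists when g = 0."""
    return f / (4.0 * h * (h + f))


def A_at_candidate(h, f, V):
    """A(a) from (7), evaluated at a* = f / [2(h + f)] with g = 0."""
    a_star = f / (2.0 * (h + f))
    return a_star**2 * (h + f) + V * f * h - a_star * f


h, f = 1.0, 1.0                       # illustrative precisions
print(noise_threshold(h, f))          # 0.125
for V in (0.10, 0.125, 0.20):
    # A is negative below the threshold and positive above it.
    print(V, A_at_candidate(h, f, V))
```

Algebraically, $A(a^*) = fh\{V - f/[4h(h + f)]\}$, so the sign change occurs exactly at the stated bound.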

Corollary 2 has an immediate economic interpretation. Suppose 
that the trader conjectures that the price offered by the specialist 
contains information. If there is very little noise, that is, $V \le f/[4h(h + f)]$, 
the trader puts much weight on what price he is quoted as a signal 
of the realization of the risky asset. For example, if a high price is 
quoted, the trader believes that the realization will be large and there¬ 
fore demands a good deal of the risky asset. Thus, independent of his 
private information, the specialist’s incentive is to select a high ex¬ 
change rate, since if he can sell even a little bit of the risky asset at a 
very high price, the specialist makes a big profit. But if the specialist 
selects a high exchange rate independent of what he knows, price 
cannot contain information, so the trader’s conjecture is false. Suppose, 
on the other hand, that the trader conjectures that price contains no 
information (i.e., he ignores price as a source of information). Then 
the (expected-utility-maximizing) exchange rate the specialist selects 
will depend on his private information, so the trader’s conjecture is 
false once again. In short, there is no conjecture the trader can make 
that is fulfilled until noise passes beyond some threshold, which les¬ 
sens the weight he puts on price as a source of information. 

Corollary 2 is interesting also because of the discontinuity it 
suggests. Provided that the trader has some private information, that is, 
$g > 0$, no matter how small, an equilibrium exists independent of $V$. 
(For example, as shown in corollary 1, it exists even at $V = 0$.) The 
intuition here is that whenever the trader has some independent 
source of information to substantiate claims by the specialist, implicit 
in the price the trader is quoted, the specialist is kept in check and the 
equilibrium is preserved. 

For completeness, we consider the case in which the specialist has 
no private information whatever, that is, $f = 0$. This is equivalent to 
requiring that $a^* = 0$ in the expression for price, since price cannot 
depend on information the specialist does not have. 

Corollary 3: When $f = 0$, a (unique) equilibrium takes the form 
$a^* = 0$, $b^* = -1/[2(h + g)]$, $c^* = y_0$. 
Proof: Since the specialist has no private information, $a^*$ must be 
zero. When $a^*$ and $f$ are zero, equations (1) and (2) are both implicitly 
satisfied (see the proof of lemma 1 in the App.). Substituting $a^*$ equal 
to zero into the expressions for $b$ and $c$ in (3) yields the result 
(factoring out $f$ in the former case). Q.E.D. 

Corollary 3 is a straightforward monopoly problem since the 
specialist is endowed with no private information. For example, here the 
trader ignores price as a source of information, and thus his demand 
depends only on what he observes privately, $\tilde y_t = y_t$. 

IV. The Specialist’s Surplus 

The specialist’s surplus is defined as the difference between the 
specialist’s expected utility at an equilibrium and his expected utility at 
his autarky position.⁹ Because the specialist is risk neutral, we assume 
without loss of generality that the specialist’s expected utility at his 
autarky position is zero. The purpose of this section is to explore how 
a specialist’s surplus is affected by (1) the amount of the specialist’s 
private information, as represented by $f$; (2) the amount of the 
trader’s private information, as represented by $g$; and (3) the level of 
noise, as represented by $V$. The reason for examining this issue is that 
a specialist may be able to control or influence these parameters 
(especially $f$ and $V$); therefore, it is worth considering his incentives to do 
so. For example, a specialist may be able to control the amount of 
private information he acquires; or, if $V$ is thought of informally as 
the level of static in the telephone line over which he communicates 
the price of a risky asset, the specialist may be able to set the level of 
static to suit his purposes. To address this question, we first need an 
expression for the specialist’s surplus. 

Theorem 2: The specialist’s surplus is determined by substituting 
the $a^*$ that satisfies equations (1) and (2) in lemma 1 into the 
expression 

$$\frac{x_t^2(a^2 + fV)[a^2(h + f + g) + fV(h + g) - fa]}{[2a^2(h + f + g) + 2fV(h + g) - fa]^2} + \frac{[a^2(h^{-1} + f^{-1}) - V][a^2(h + f + g) + fV(h + g) - fa]}{a^2 + fV} - gh^{-1}.$$

Proof: See Appendix. 
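
As a rough consistency check on the theorem 2 expression, the sketch below evaluates it at $g = 0$ and $a^* = f/[2(h + f)]$ and compares the result with the three-term closed form $x_t^2/4h - x_t^2 f^2[4V(h + f) + 1]/[64V^2h^2(h + f)^3] - [f - 4Vh(h + f)]^2/\{4h[f + 4V(h + f)^2]\}$ that the expression reduces to in that case. Both formulas are written here as read from the print and should be treated as assumptions. The function names and numbers are illustrative.

```python
def specialist_surplus(a, x_t, h, f, g, V):
    """Theorem 2 expression at a given a (requires f > 0 and a^2 + fV > 0)."""
    A = a**2 * (h + f + g) + f * V * (h + g) - f * a
    D = 2.0 * a**2 * (h + f + g) + 2.0 * f * V * (h + g) - f * a
    s = a**2 + f * V
    return (x_t**2 * s * A / D**2
            + (a**2 * (1.0 / h + 1.0 / f) - V) * A / s
            - g / h)


def surplus_g0_closed_form(x_t, h, f, V):
    """Closed form for g = 0, valid on the region V > f / [4h(h + f)]."""
    return (x_t**2 / (4.0 * h)
            - x_t**2 * f**2 * (4.0 * V * (h + f) + 1.0)
            / (64.0 * V**2 * h**2 * (h + f)**3)
            - (f - 4.0 * V * h * (h + f))**2
            / (4.0 * h * (f + 4.0 * V * (h + f)**2)))


x_t, h, f, V = 2.0, 1.0, 1.0, 1.0     # illustrative values
a_star = f / (2.0 * (h + f))          # equilibrium a* when g = 0
general = specialist_surplus(a_star, x_t, h, f, 0.0, V)
closed = surplus_g0_closed_form(x_t, h, f, V)
print(general, closed)                # the two values should agree
```

In this example both routes give a surplus below the symmetric-information benchmark $x_t^2/4h = 1$.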

The usefulness of theorem 2 is limited because for the general case 
of $f$, $g$, and $V$, each positive, the parameter $a^*$ cannot be expressed in a 
simple closed form. As a way of approaching the general case, we first 
consider, separately, the two polar cases in which (1) the specialist has 
no private information while the trader does (i.e., $f = 0$, $g > 0$); and 
(2) the trader has no private information while the specialist does (i.e., 
$f > 0$, $g = 0$). Here, simple closed-form expressions for $a^*$ can be 
derived (see corollaries 2 and 3 above). For each of these cases, a 
recurrent question is what level of noise $V$ maximizes the specialist’s 
surplus. 

⁹ Other definitions of the specialist’s surplus might be used depending on what one 
assumes about the specialist’s alternatives. In this definition, we implicitly assume that 
the specialist’s alternative is not to trade at all (autarky). 

The case where the specialist has no private information and the 
trader does (i.e., $f = 0$, $g > 0$) helps to illuminate certain aspects of the 
model. One might suspect that the specialist would be at such an 
enormous disadvantage relative to the trader in this case that he 
would stay at his autarky position rather than trade at all. When $f = 0$, 
the specialist’s surplus is $[x_t^2/4(h + g)] - (h + g)V - gh^{-1}$. It follows 
immediately from this equation that any positive value of $V$ makes the 
specialist worse off than when $V = 0$. This result is actually quite 
plausible; noise interferes only with the specialist’s ability to set the 
optimal price and cannot dilute the trader’s information, because 
when $f = 0$ the trader ignores price as a source of information 
anyway. When $V = 0$, the specialist’s surplus is $[x_t^2/4(h + g)] - gh^{-1}$. This 
surplus is clearly decreasing in $g$, so the specialist is indeed worse off 
the better the quality of information available to the trader. However, 
if $g$ is small enough, in particular when $g < (-h + \sqrt{h^2 + h x_t^2})/2$, the 
specialist’s surplus will be positive and he will prefer to trade despite 
his severe information disadvantage. 
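
The bound on $g$ follows from setting the $V = 0$ surplus to zero and solving the resulting quadratic $4g^2 + 4hg - h x_t^2 = 0$ for its positive root. A small sketch with illustrative function names and values checks this numerically.

```python
import math


def surplus_uninformed_specialist(x_t, h, g, V=0.0):
    """Specialist's surplus when f = 0: x_t^2 / [4(h+g)] - (h+g)V - g/h."""
    return x_t**2 / (4.0 * (h + g)) - (h + g) * V - g / h


def max_trader_precision(x_t, h):
    """Largest g at which the V = 0 surplus is still non-negative."""
    return (-h + math.sqrt(h**2 + h * x_t**2)) / 2.0


x_t, h = 2.0, 1.0                         # illustrative values
g_bar = max_trader_precision(x_t, h)
print(g_bar)
print(surplus_uninformed_specialist(x_t, h, g_bar))        # approximately zero
print(surplus_uninformed_specialist(x_t, h, 0.5 * g_bar))  # positive
print(surplus_uninformed_specialist(x_t, h, 0.5 * g_bar, V=0.1))  # noise lowers it
```

The last line illustrates the statement that any positive $V$ leaves the specialist worse off when $f = 0$.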

This case illustrates the importance in this analysis of our assump¬ 
tion that the specialist is more risk tolerant than the trader. The 
trader’s greater aversion to risk relative to the specialist means that 
the trader is willing to trade to get rid of some of his risky endowment 
even though the specialist sets the price of the risky asset below its 
expected value. This is analogous to the standard monopoly problem 
in which the seller gains by selling insurance at a monopoly price but 
loses because of inside information. 

When the trader has no information and the specialist does, that is, 
$f > 0$, $g = 0$, theorem 2 implies that the specialist’s surplus reduces to 

$$\frac{x_t^2}{4h} - \frac{x_t^2 f^2[4V(h + f) + 1]}{64V^2h^2(h + f)^3} - \frac{[f - 4Vh(h + f)]^2}{4h[f + 4V(h + f)^2]}. \tag{8}$$

Casual inspection of equation (8) suggests that for any $f > 0$, there 
exists some $V$ in the interior of $[f/[4h(h + f)], \infty)$ that maximizes the 
specialist’s surplus (where we consider only that region over which an 
equilibrium exists, i.e., $V > f/[4h(h + f)]$; see corollary 2). On closer 
inspection, however, we observe that no pair of $f > 0$ and $V > f/[4h(h + f)]$ 
dominates $f = 0$ and $V = 0$! We state this as a corollary. 
Corollary 4: Provided that the trader has no information, the 
specialist has no incentive either to become informed or to introduce 
noise, when his alternative is a level of information and noise in a 
region over which an equilibrium exists. 

Proof: Under the restriction $V > f/[4h(h + f)]$, the expression in 
equation (8) is (strictly) decreasing in $f$. Therefore, substituting the 
value $V = f/[4h(h + f)]$ into equation (8) achieves an upper bound on 
the specialist’s surplus: namely, zero. However, when $f = 0$ and $V = 0$, 
the expression in equation (8) assumes the value $x_t^2/4h$. Since this 
latter value is clearly higher, our claim is demonstrated. Q.E.D. 

The conditioning statement in corollary 4, namely, that the alterna¬ 
tive to / and V equal to zero is some level at which an equilibrium 
exists, is based on technical rather than economic considerations. 
There is no endogenous motivation in the model for the specialist to 
guarantee an equilibrium when the trader has no information.¹⁰ 

Corollary 4 is useful for contrasting the polar case of $g = 0$ with the 
more general case of g > 0. The intuition underlying corollary 4 is 
that the specialist has no particular need for private information 
other than to inform him about the trader’s excess demand function: 
if the trader has no private information, his excess demand function 
is known to the specialist with certainty. When the trader has some 
private information, it is generally the case that it is optimal for the 
specialist to be informed as well. The specialist profits from acquiring 
private information since this information tells him something about 
the excess demand to be submitted to him on the basis of the price he 
quotes. 

It is also interesting to note from corollary 4 that $f = 0$, $g = 0$, and 
$V = 0$ is the symmetric information case, and here the specialist’s 
surplus is $x_t^2/4h$. Clearly, the specialist’s surplus decreases as $h$, the 
precision of the unknown liquidating value of the risky asset, increases. 
This suggests that as more information commonly known to the 
specialist and trader is made available, the rent earned by the specialist 
drops. This may help to explain the substantial unexpected drop of 
the price of a seat on the New York Stock Exchange in March 1934, 
the month when the Securities and Exchange Act of 1934 was 
introduced into Congress. If the Act had the effect of increasing 
information to traders, the seat price drop would be a reflection of the loss 
of specialist surplus. This point is more difficult to make in the more 
general expression that arises for the specialist’s surplus in the diverse 
case (see theorem 2). Visual evidence of the seat price drop in March 
1934 can be found in Schwert (1977). 

¹⁰ As discussed in Sec. III, when a trader has no information and attempts to use 
price as a source of information, the specialist’s incentive is to choose an arbitrarily 
large price; however, it is not clear that this is economically meaningful in the absence 
of some constraint that ensures an equilibrium. If a regulatory authority prohibits the 
specialist from having private information (e.g., no “inside information”), then an 
equilibrium will exist and the specialist will choose $V = 0$, even if the trader has no 
information. In this context, the regulatory prohibition on the specialist’s having 
information might be thought of as a means of eliminating speculative bubbles. 

With regard to an optimal level of noise $V$, we have shown so far
that a specialist's surplus is maximized at $V = 0$ when either $f = 0$
or $g = 0$. Therefore, in these cases, garbling does the specialist no
good. For each fixed and positive pair of $f$ and $g$, there generally
exists a positive, finite level of noise $V$ that maximizes the
specialist's surplus.¹¹
On the one hand, the specialist profits from noise since this masks his 
private information from the trader. On the other hand, beyond a 
certain threshold, too much noise interferes with the specialist’s ability 
to optimize. 


V. Consideration of Assumptions 

In summarizing our analysis, we consider the realism of our model in 
relation to how a specialist is commonly thought to operate. The first
question concerns whether specialists are monopolists. Clearly, there 
is no prohibition on one trader’s exchanging shares of a risky asset 
with some other trader and thereby circumventing the specialist. 
However, in the presence of large transaction costs associated with 
one trader’s searching out someone else with whom to exchange as¬ 
sets at a mutually agreeable price, the idea that the trader will deal 
exclusively with the specialist is not unreasonable. Although these 
search costs are only implied in our model, when they are explicitly 
considered, the specialist may have the latitude to act as a monopolist. 

The second question concerns the fact that price is not contingent
on the quantity of shares exchanged. This has two implications: the
specialist is put at some disadvantage in that he cannot use price as a
barrier against a trader's private information, and the trader himself
behaves as a pure price taker in that he believes his demand has no
influence on price. With regard to the first implication, it is more
reasonable to imagine that the specialist does make price contingent
on the quantity traded. This would be a very simple mechanism to
guard against traders with superior information (e.g., “inside
information”) exploiting their knowledge. To that extent, our model
suggests a limiting case to the more general circumstance in which
prices are influenced by demand.

11 To show that there are situations in which the specialist prefers a
positive, finite level of $V$, consider the case in which
$f = g = h = 1$. Here, we evaluate the specialist's surplus at $V = 0$,
$V = 2$, and as $V$ approaches infinity. When $V = 0$, the specialist's
surplus is $0.0625x_t^2 - 0.5$. When $V = 2$, an equilibrium exists with
$a = 0.3823$ and the specialist's surplus can be written as
$0.1206x_t^2 - 4.2276$. As $V$ approaches infinity, $a$ approaches
0.375, and the specialist's surplus approaches $0.125x_t^2 - 2V - 1$. It
is a simple exercise to show that when $x_t > 8.01$, the selection of
$V = 2$ dominates either $V$ zero or $V$ infinite. Since there exists
some finite, positive level of $V$ (i.e., $V = 2$) that yields a higher
surplus than either polar extreme, the optimal level of $V$ (assuming
the specialist can freely select it) must be finite and positive.

The second implication is, perhaps, more profound in that if trad¬ 
ers are aware that their demand affects price, then they will behave 
strategically and not competitively. This fundamentally changes our 
analysis in that our primary objective is to integrate the role of the
specialist without departing significantly from the competitive
behavior exhibited by traders in the presence of a Walrasian auctioneer.
We also remark that the large information requirements necessary for
traders to act strategically suggest that competitive behavior is a more 
realistic approximation of an actual market setting. 

The final question concerns whether specialists stay in business be¬ 
cause they have a comparative advantage at bearing risk. (Looked at 
somewhat differently, the question could be rephrased to ask whether 
it would evolve naturally that the specialist would be the most risk- 
tolerant individual within a community of traders.) In our model, the 
specialist's greater tolerance for risk vis-à-vis the trader is key
because it permits the specialist to achieve a positive surplus even in
the presence of unfavorable information asymmetries. It can be argued,
however, that the positive surplus that results from increased risk
tolerance may proxy for the variety of institutional frictions (which we
ignore) through which a specialist profits: transaction charges, a
bid-ask spread, a price contingent on the quantity traded, etc. In other
words, the specialist may indeed have no comparative advantage at
bearing risk. However, the peripheral ways in which he earns rents by
performing a specialist's tasks may cause him to behave as if he is
more risk tolerant or at least ensure him a positive surplus even in the
presence of better-informed traders.


VI. Conclusion 

This paper is motivated by what we perceive to be a gap in the
literature on the determination of prices in rational expectations
equilibria. Much of the research on rational expectations assumes the
existence of a neoclassical Walrasian auctioneer who clears the market
but is exogenous to the model. The Walrasian auctioneer assumption,
though useful and powerful in many situations, is in a fundamental
sense inconsistent with the spirit of rational expectations models,
especially when information is not symmetrically distributed among
market participants.

Our work yields a number of results of theoretical and empirical 
interest, the main points of which are summarized here. 




i) Equilibria exist under a broad set of conditions but not under all 
conditions. In particular, when the trader has no information and the 
specialist does, there is no equilibrium unless the noise in the price 
system is sufficiently large. When noise is small, a trader who attempts 
to use price as a source of information puts so much weight on the 
information contained in the price that the specialist has incentive to 
raise the price without bound. 

ii) Assuming the trader has no information and that the specialist is 
forced to choose the level of noise sufficiently large to ensure an 
equilibrium, the specialist has no incentive either to acquire informa¬ 
tion or to introduce noise. Because the specialist is risk neutral, the 
only reason he is ever interested in acquiring information is to obtain 
a better estimate of the trader’s demand function before setting price. 
When the trader has no information his demand function is perfectly 
predictable, so the specialist gains nothing from acquiring informa¬ 
tion. The optimal price is nonstochastic when the trader has no infor¬ 
mation, and it is only to the specialist’s disadvantage to add noise. 

iii) Even when the specialist is at an extreme information
disadvantage relative to the trader (i.e., when the specialist has no
private information but the trader does), the specialist may still
prefer trading to autarky. This is because the specialist may be able to
exploit the risk aversion of the trader sufficiently to offset the
trader's information advantage.

iv) If both the specialist and the trader possess private information, 
the specialist may gain by adding a finite amount of noise to his price 
quotation. In this situation, the noise added to the price garbles the 
information transferred to the trader while forcing the specialist to 
relinquish some control over the price at which trading takes place. 

Despite the restrictiveness of certain of our assumptions, we believe 
that our analysis has the advantage that it approaches the problem of 
modeling price formation from the correct methodological perspec¬ 
tive. Specifically, the specialist’s private incentives, in conjunction with 
the rational expectations of traders, are considered in providing a 
rationale for the information content of prices. 


Appendix 

Proof of Lemma 1

The proof of lemma 1 is in three parts. First, we determine the trader's
optimal response to the offering of a price $p$ for the risky asset by
computing his excess demand as a function of his conjecture about the
price rule used by the specialist. Second, in part ii, we determine an
expression for the exchange rate of the risky asset that maximizes the
specialist's expected utility in anticipation of the trader's optimal
response, and the information the specialist observes. Finally, in part
iii, we show how the requirement that the trader's conjecture about the
price rule be correct leads to the two expressions in lemma 1.

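i) Suppose that the trader conjectures that the price rule has the
following linear form: $\tilde P + \tilde\delta = a\tilde y_s + bx_t + c + \tilde\delta$.
This implies that the random variables
$(\tilde u, \tilde y_t, \tilde P + \tilde\delta)$ have a trivariate
normal distribution with mean $(y_0, y_0, ay_0 + bx_t + c)$ and
covariance matrix:

$$
\begin{array}{c|ccc}
 & \tilde u & \tilde y_t & \tilde P + \tilde\delta \\ \hline
\tilde u & h^{-1} & h^{-1} & ah^{-1} \\
\tilde y_t & h^{-1} & h^{-1} + g^{-1} & ah^{-1} \\
\tilde P + \tilde\delta & ah^{-1} & ah^{-1} & a^2(h^{-1} + f^{-1}) + V
\end{array}
$$

Then the trader's optimal response to the quotation of a price $p$ for
the risky asset is an excess demand function of the form

$$
R(x_t, y_t, p) = \frac{E[\tilde u \mid \tilde y_t = y_t,\ \tilde P(\cdot) + \tilde\delta = p] - p}
{\mathrm{var}[\tilde u \mid \tilde y_t = y_t,\ \tilde P(\cdot) + \tilde\delta = p]} - x_t,
$$

where $E[\tilde u \mid \tilde y_t = y_t, \tilde P(\cdot) + \tilde\delta = p]$
and $\mathrm{var}[\tilde u \mid \tilde y_t = y_t, \tilde P(\cdot) + \tilde\delta = p]$
are the mean and variance of $\tilde u$ conditional on observing
$\tilde y_t = y_t$ and $\tilde P(\cdot) + \tilde\delta = p$. The mean
and variance can be expressed as

$$
E[\tilde u \mid \tilde y_t = y_t,\ \tilde P(\cdot) + \tilde\delta = p]
= y_0 + gk_1(y_t - y_0) + k_2(p - ay_0 - bx_t - c)
$$

and

$$
\mathrm{var}[\tilde u \mid \tilde y_t = y_t,\ \tilde P(\cdot) + \tilde\delta = p] = k_1,
$$

where

$$
k_1 = \frac{a^2 + fV}{a^2(h + f + g) + fV(h + g)}, \qquad
k_2 = \frac{af}{a^2(h + f + g) + fV(h + g)}.
$$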
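As a numerical cross-check on part (i), the short sketch below regresses $\tilde u$ on the two observables using the covariance matrix stated there and compares the result with closed forms for $k_1$ and $k_2$. The closed forms are an assumption of the sketch: they are the expressions implied by the covariance matrix, with the common denominator $a^2(h + f + g) + fV(h + g)$. The function names and the parameter values in the usage note are hypothetical.

```python
def conditional_moments(a, f, g, h, V):
    """Regress u on (y_t, P + delta) using the covariance matrix of part (i).

    Returns the weight on y_t, the weight on the price, and the
    conditional variance of u.
    """
    # Covariances of u with the two observables (y_t, P + delta).
    c1, c2 = 1.0 / h, a / h
    # Covariance matrix of the observables.
    s11 = 1.0 / h + 1.0 / g
    s12 = a / h
    s22 = a ** 2 * (1.0 / h + 1.0 / f) + V
    det = s11 * s22 - s12 ** 2
    # Regression weights: [c1, c2] times the inverse of [[s11, s12], [s12, s22]].
    w_yt = (c1 * s22 - c2 * s12) / det
    w_price = (c2 * s11 - c1 * s12) / det
    cond_var = 1.0 / h - (w_yt * c1 + w_price * c2)
    return w_yt, w_price, cond_var


def k1_k2(a, f, g, h, V):
    """Closed forms assumed for k1 and k2 (shared denominator)."""
    denom = a ** 2 * (h + f + g) + f * V * (h + g)
    return (a ** 2 + f * V) / denom, a * f / denom
```

For example, with the illustrative values `a = 0.5`, `f = g = h = 1`, `V = 2`, the weight on $y_t$ equals $gk_1$, the weight on the price equals $k_2$, and the conditional variance equals $k_1$.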

ii) The specialist observes the realization $\tilde y_s = y_s$ and then
selects an exchange rate for the risky asset that maximizes his expected
utility in anticipation of the trader's optimal response. This is
equivalent to choosing a $p^*$ that solves

$$
\max_p\ E\{(\tilde u - p - \tilde\delta)[-R(x_t, \tilde y_t, p + \tilde\delta)] \mid \tilde y_s = y_s\}.
$$

The use of the calculus indicates that the first-order condition for a
maximum is satisfied at $p^*$, where

$$
p^* = E(\tilde u \mid \tilde y_s = y_s)
+ \frac{E[R(x_t, \tilde y_t, p^* + \tilde\delta) \mid \tilde y_s = y_s]}{k_1^{-1}(1 - k_2)},
$$

provided that $k_2 < 1$, thereby ensuring that the maximization problem
is concave in $p$.

iii) The requirement $k_2 < 1$ becomes a requirement for an
equilibrium, since in its absence, that is, $k_2 \ge 1$, the specialist
maximizes his expected utility by selecting $p^*$ infinite, independent
of his private information. But if $p^*$ is always infinite, it does not
depend on $\tilde y_s = y_s$ since it is invariant to the information
observed by the specialist. Therefore, any conjecture the trader makes
about price's being a linear function of the specialist's private
information cannot be fulfilled in the absence of $k_2 < 1$. Note
further that

$$
E(\tilde u \mid \tilde y_s = y_s) = y_0 + \frac{f}{h + f}(y_s - y_0),
$$

$$
\begin{aligned}
E[R(x_t, \tilde y_t, p + \tilde\delta) \mid \tilde y_s = y_s]
&= k_1^{-1}[y_0 + gk_1E(\tilde y_t - y_0 \mid y_s) + k_2(p - ay_0 - bx_t - c) - p] - x_t \\
&= k_1^{-1}\Big[y_0 + gk_1\frac{f}{h + f}(y_s - y_0) - k_2(ay_0 + bx_t + c) - (1 - k_2)p\Big] - x_t.
\end{aligned}
$$


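This implies that $p^*$ reduces to

$$
p^* = \frac{1}{2}\Big[y_0 + \frac{f}{h + f}(y_s - y_0)\Big]
+ \frac{y_0 + gk_1\frac{f}{h + f}(y_s - y_0)}{2(1 - k_2)}
- \frac{k_1x_t}{2(1 - k_2)} - \frac{k_2(ay_0 + bx_t + c)}{2(1 - k_2)}.
$$

However, for the trader's conjecture to be fulfilled, it must be that
$E(p^* + \tilde\delta) = E(a\tilde y_s + bx_t + c + \tilde\delta) = ay_0 + bx_t + c$.
This requirement reduces $p^*$ further to

$$
p^* = y_0 + \frac{f}{2(h + f)}\Big[1 + \frac{gk_1}{1 - k_2}\Big](y_s - y_0) - \frac{k_1x_t}{2 - k_2}.
$$

Therefore, an equilibrium is a triplet of constants:

$$
a = \frac{f}{2(h + f)}\Big[1 + \frac{gk_1}{1 - k_2}\Big], \qquad
b = -\frac{k_1}{2 - k_2}, \qquad
c = (1 - a)y_0. \tag{A1}
$$

However, $k_1$ and $k_2$ are functions of $a$. Substituting in the
expressions for $k_1$ and $k_2$, a solution to equation (A1) requires
that $a$ be a real-valued root to the third-order polynomial equation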


$$
[2a(h + f) - f][a^2(h + f + g) - af + Vf(h + g)] - fg(Vf + a^2) = 0. \tag{A2}
$$

Furthermore, the requirement $k_2 < 1$ is equivalent to

$$
a^2(h + f + g) - af + Vf(h + g) > 0. \tag{A3}
$$

Q.E.D.
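The equilibrium condition can be explored numerically. The sketch below is only an illustration, not the authors' procedure: it scans an interval for sign changes of the cubic (A2), refines each by bisection, and keeps the roots that satisfy (A3). The function names, the search interval, and the grid size are all hypothetical choices.

```python
def a2_residual(a, f, g, h, V):
    """Left-hand side of the cubic equilibrium condition (A2)."""
    return ((2 * a * (h + f) - f)
            * (a * a * (h + f + g) - a * f + V * f * (h + g))
            - f * g * (V * f + a * a))


def a3_holds(a, f, g, h, V):
    """Condition (A3), equivalent to k2 < 1."""
    return a * a * (h + f + g) - a * f + V * f * (h + g) > 0


def equilibrium_roots(f, g, h, V, lo=-5.0, hi=5.0, steps=10000):
    """Real roots of (A2) in [lo, hi] that also satisfy (A3)."""
    roots = []
    width = (hi - lo) / steps
    for i in range(steps):
        left = lo + i * width
        right = left + width
        f_left = a2_residual(left, f, g, h, V)
        f_right = a2_residual(right, f, g, h, V)
        if f_left == 0.0:
            candidate = left
        elif f_left * f_right < 0:
            # Bisection on a bracketing interval.
            for _ in range(100):
                mid = 0.5 * (left + right)
                f_mid = a2_residual(mid, f, g, h, V)
                if f_left * f_mid <= 0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            candidate = 0.5 * (left + right)
        else:
            continue
        if a3_holds(candidate, f, g, h, V):
            roots.append(candidate)
    return roots
```

With $f = g = h = 1$ and $V = 2$, `equilibrium_roots(1, 1, 1, 2)` returns a single root close to the value $a = 0.3823$ reported in the text's numerical example, and for very large $V$ the root approaches 0.375.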


Proof to Theorem 2 

Theorem 2 is simply a computation of the difference between the
specialist's expected utility at an equilibrium and the specialist's
autarky point. First, note that the price can be expressed as
$\tilde P + \tilde\delta = a(\tilde y_s - y_0) + y_0 + bx_t + \tilde\delta$,
and $R(\cdot)$ as
$R[x_t, \tilde y_t, \tilde P + \tilde\delta] = k_1^{-1}\{gk_1(\tilde y_t - y_0) - bx_t - (1 - k_2)[a(\tilde y_s - y_0) + \tilde\delta]\} - x_t$.
Thus, the specialist's surplus is

$$
\begin{aligned}
&E_{y_s}\Big[E\big([\tilde u - a(y_s - y_0) - y_0 - bx_t - \tilde\delta]\{-R[x_t, \tilde y_t, a(y_s - y_0) + y_0 + bx_t + \tilde\delta]\} \mid \tilde y_s = y_s\big)\Big] \\
&\quad = -k_1^{-1}E_{y_s}\Big[E\big(\{(\tilde u - y_0) - a(y_s - y_0) - bx_t - \tilde\delta\} \\
&\qquad\quad \times \{gk_1(\tilde y_t - y_0) - (b + k_1)x_t - (1 - k_2)[a(y_s - y_0) + \tilde\delta]\} \mid \tilde y_s = y_s\big)\Big] \\
&\quad = -x_t^2b(1 + k_1^{-1}b) + agh^{-1} + a(1 - k_2)k_1^{-1}h^{-1} - a^2(1 - k_2)k_1^{-1}(h^{-1} + f^{-1}) \\
&\qquad\quad - (1 - k_2)k_1^{-1}V - gh^{-1} \\
&\quad = -x_t^2b(1 + k_1^{-1}b) + a^2(1 - k_2)k_1^{-1}(h^{-1} + f^{-1}) - (1 - k_2)k_1^{-1}V - gh^{-1},
\end{aligned}
$$

where the last equality follows from the relation for an equilibrium
required in equation (A2). Substituting back in the expressions for
$k_1$ and $k_2$ (see the proof to lemma 1) yields the results. Q.E.D.
On the Economics of Compliance
with the Minimum Wage Law


Yang-Ming Chang

Kansas State University


Isaac Ehrlich

State University of New York at Buffalo


This paper reexamines the issues of compliance with and enforcement
of the minimum wage law recently addressed in this Journal by
Ashenfelter and Smith and by Grenier. Pursuing a more rigorous
methodology we are able to add new general conclusions, and correct
and reconcile some previous conflicting conclusions concerning
the role of the disparity between the minimum and free market
wages, the level and elasticity of labor demand, and the magnitude
of deterring monetary sanctions on the noncompliance decision.
Our formulation also addresses the law evasion (reduced wages) as
well as the law avoidance (modified employment) aspects of the
noncompliance decision, which previous formulations have ignored.


Recent articles in this Journal have dealt with the issues of compliance
with and enforcement of the minimum wage provisions of the Fair
Labor Standards Act (FLSA). Ashenfelter and Smith (1979; henceforth
AS), analyzing the determinants of noncompliance behavior by
firms, concluded that “the incentive to comply is lower: (a) the lower is
the market wage below the minimum wage, and (b) the larger is the
elasticity of demand for labor (in absolute value)” (p. 336). In a
comment on AS, Grenier (1982; henceforth G) treated the prospective
penalty for noncompliance as a function of the difference between
the statutory minimum and the free market wage, unlike AS, who
had treated it as a fixed sanction, and he concluded that “the incentive
to comply is lower: (a) the closer is the minimum wage to the market
wage, and (b) the smaller is the elasticity of demand for labor” (G, p.
186). G observed that “this result is totally contrary to the result
obtained by AS,” and he ascribed the discrepancy between the two
analyses to the alternative penalty structures assumed. Also, contrary to
AS's analysis concerning an efficient enforcement mechanism, G
suggested that “the requirement that a noncomplying employer pay a
fraction of the difference between the two wages to his employees
constitutes a real penalty” (G, p. 184) and that consequently his
analysis provided a rationale for observed enforcement practices.

We are indebted to James J. Heckman for valuable comments on an earlier
draft.
[Journal of Political Economy, 1985, vol. 93, no. 1]

In this article we show that both AS’s and G’s analyses are only 
partially correct. We propose the following: (1) A sanction based on 
the requirement that a noncomplying firm pay a fraction of the dif¬ 
ference between the statutory minimum and the market wage cannot 
constitute an effective deterrent on profit-maximizing firms. (2) If 
positive, the incentive for compliance is lower the lower the market 
wage below the minimum, regardless of the penalty structure. (3) For 
a given reduction in the market wage below the statutory minimum, 
the incentive for noncompliance is stronger (a) the larger the quantity 
of labor demanded and (b) the lower the market wage itself; further¬ 
more, (c) the increase in the incentive for noncompliance due to a 
lower market wage is greater the higher (in absolute value) the
elasticity of demand for labor. These propositions modify some erroneous
inferences in both AS's and G's papers, which are due to a basic
methodological shortcoming in their analyses. Our formulation is also
more general in that it addresses the “law evasion” aspect (the effect
on wages paid) as well as the “law avoidance” aspect (the effect on
labor employment) of the noncompliance decision, which the previous
formulations have ignored.

I. The Economics of Compliance 

Our model adopts the simple 1-period optimization framework for an
expected-profit-maximizing firm also used by AS and G. We assume
that at the start of the representative period the firm is faced with the
choice of paying either the statutory minimum wage $M$ or the “free
market” wage $w$. The capital market and the firm's product market
are in competitive equilibrium with rental price of capital $r$ and
product price $p$. If the firm chooses to comply with the law, its
maximized profits would be given by the indirect profit function
$\pi(M, r, p)$. Similarly, if the firm could pay the free market wage
without any risk of being detected, its indirect profit function would
be $\pi(w, r, p)$. Since the profit function is nonincreasing in wages
and, by definition, $M > w$, we have $\pi(w, r, p) - \pi(M, r, p) > 0$ as
long as labor employment, $L(w, r, p)$, is positive.

The prospective decline in profits given compliance with the
minimum wage law generates an intrinsic incentive for noncompliance.
Under effective enforcement of the law, however, evasion of
the law is punishable by a legal sanction. Assume that with probability
$\lambda$ of being detected and convicted the firm is required to pay
back for each unit of labor a positive multiple $k > 0$ of the
difference between the legal minimum wage and the market wage; that is,
the actual fine is $F = k(M - w)L$. The expected profit of a
noncomplying firm to be maximized is then given by

$$
\begin{aligned}
E(\pi) &= (1 - \lambda)[pf(L, K) - wL - rK] + \lambda[pf(L, K) - wL - rK - k(M - w)L] \\
&= pf(L, K) - [w + \lambda k(M - w)]L - rK,
\end{aligned} \tag{1}
$$

and the indirect (expected) profit function becomes

$$
\pi[w + \lambda k(M - w), r, p] = \pi[E(w), r, p], \tag{2}
$$

where $E(w) \equiv w + \lambda k(M - w)$ represents the expected wage
rate in the case of noncompliance. Equation (2) recognizes implicitly
the dependence of the violating firm's labor demand on the prospective
legal sanction for noncompliance, $L = L[E(w), r, p]$, which, by raising
the marginal cost of labor, acts as a “deterrent” to labor employment.
This formulation of the problem accounts for the wage evasion as well
as the employment avoidance implication of the noncompliance decision
by the firm, which both AS's and G's formulations generally
ignore.

Under profit-maximizing behavior the incentive for noncompliance
can be measured by the magnitude of the excess profit from
noncompliance:

$$
V = \pi[E(w), r, p] - \pi(M, r, p). \tag{3}
$$

In the absence of any additional costs or benefits to employers from
noncompliance, the decision whether to comply with the law would
depend strictly on the sign of $V$. If firms incur, however, some
additional “fixed” costs ($D$) beyond the prescribed legal sanctions in
the form of loss of federal contracts or public “good will,” and if these
costs vary in magnitude across firms, the actual frequency of violations
of the FLSA law would be a monotonically increasing function
of the excess profit from noncompliance. This analysis leads to the
following propositions:
Proposition 1: (a) A minimum wage enforcement policy requiring
the violating firm to pay only a fraction of the difference between the
statutory minimum and the market wage per unit labor will not
constitute an effective deterrent. (b) The incentive for noncompliance
would be eliminated, in contrast, if the penalty rate, $k$, were
determined at a level sufficiently high to make the expected wage rate
for the violating firm higher than the minimum wage. (c) Regardless of
the structure of the legal sanction imposed, whether fixed or
proportional to $L(M - w)$, the incentive for noncompliance, if
positive, will be greater the lower is the market wage below the
statutory minimum.

The proof of this proposition follows from the well-known property
of the profit function, which is decreasing in wages as long as
employment is positive. Clearly, if $k \le 1$, then $\lambda k \le 1$ as
well, since $0 \le \lambda \le 1$. Thus, $E(w) = w + \lambda k(M - w) \le M$,
and $V = \pi[E(w), r, p] - \pi(M, r, p) \ge 0$, which proves
proposition 1a. A fortiori, if the penalty rate were set at a critical
level above unity, $k > 1/\lambda$, $E(w)$ would exceed $M$ and $V$
would become negative, in which case the incentive for noncompliance
would be entirely eliminated. This proves proposition 1b. And the proof
of proposition 1c follows from the fact that if $V > 0$ (that is,
$E(w) < M$, or $(1 - \lambda k) > 0$), then

$$
\frac{dV}{dw} = \frac{\partial\pi[E(w), r, p]}{\partial E(w)}\,\frac{dE(w)}{dw}
= -L[E(w), r, p](1 - \lambda k) < 0, \tag{4}
$$

since by Hotelling's lemma
$\partial\pi[E(w), r, p]/\partial E(w) = -L[E(w), r, p]$. As long as
$(1 - \lambda k) > 0$, therefore, a reduction in the free market wage
below the statutory minimum will increase the incentive for
noncompliance.¹ And the same result would follow if the penalty
structure for noncompliance includes a fixed cost, $D$, or consists only
of such cost.
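Proposition 1 and equation (4) can be illustrated with a concrete technology. The sketch below is not the authors' model: it assumes a hypothetical single-input production function $f(L) = 2\sqrt{L}$ (no capital), for which labor demand is $L(w) = (p/w)^2$ and the indirect profit function is $\pi(w) = p^2/w$. All function names and parameter values are illustrative.

```python
def expected_wage(w, M, lam, k):
    """E(w) = w + lam * k * (M - w): expected wage under noncompliance."""
    return w + lam * k * (M - w)


def labor_demand(wage, p=1.0):
    """Labor demand for the assumed technology f(L) = 2 * sqrt(L)."""
    return (p / wage) ** 2


def profit(wage, p=1.0):
    """Indirect profit function for the assumed technology."""
    return p ** 2 / wage


def excess_profit(w, M, lam, k, p=1.0):
    """V in equation (3): profit at the expected wage minus profit at M."""
    return profit(expected_wage(w, M, lam, k), p) - profit(M, p)


def dV_dw_analytic(w, M, lam, k, p=1.0):
    """Equation (4): dV/dw = -L[E(w)] * (1 - lam * k)."""
    return -labor_demand(expected_wage(w, M, lam, k), p) * (1 - lam * k)
```

With `w = 2`, `M = 3`, and `lam = 0.5`, a full back-pay rule (`k = 1`) leaves `excess_profit` positive, `k = 2 = 1/lam` drives it to zero, and `k = 3` makes it negative, in line with propositions 1a and 1b under these assumptions.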

Proposition 2: For a given reduction in the free market wage rate
below the statutory minimum, the incentive for noncompliance is
stronger (a) the larger the quantity of labor demanded at the effective
(expected) wage rate and (b) the lower the market wage itself; in
addition (c) the increase in the incentive for noncompliance due to a
lower free market wage is greater the higher (in absolute value) the
relevant point elasticity of demand for labor.

1 It is interesting to note that unlike a reduction in $w$, an increase
in $M$ does not necessarily raise the incentive for noncompliance since

$$
\frac{\partial V}{\partial M}
= \frac{\partial\pi[E(w), r, p]}{\partial E(w)}\lambda k - \frac{\partial\pi(M, r, p)}{\partial M}
= L(M, r, p) - \lambda kL[E(w), r, p],
$$

where, by Hotelling's lemma, $L$ denotes the quantity demanded of labor
at the alternative wage levels. Note that
$1 > L(M, r, p)/L[E(w), r, p] \ge E(w)/M > \lambda k$ if $\eta \le 1$,
where $\eta$ denotes the arc elasticity of demand for labor between
$E(w)$ and $M$. Thus, a differentially higher minimum wage would
unambiguously raise the incentive for noncompliance (i.e.,
$\partial V/\partial M > 0$) only if it did not lead to a reduction in
the (optimal) wage bill of the firm upon compliance relative to
noncompliance with the law.

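The proof of proposition 2a follows immediately from equation (4)
since

$$
\frac{\partial}{\partial L}\Big(-\frac{\partial V}{\partial w}\Big) = (1 - \lambda k) > 0. \tag{5}
$$

Proposition 2b similarly follows since

$$
\frac{\partial^2V}{\partial w^2}
= -\frac{\partial L[E(w), r, p]}{\partial E(w)}(1 - \lambda k)^2
= \epsilon\,\frac{L[E(w), r, p]}{E(w)}(1 - \lambda k)^2 > 0, \tag{6}
$$

where

$$
\epsilon \equiv -\frac{\partial L[E(w), r, p]}{\partial E(w)} \cdot \frac{E(w)}{L[E(w), r, p]}.
$$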

And equation (6) implies, in turn, that 



which proves proposition 2c. The economic rationale behind
proposition 2 is straightforward: for any given discrepancy between the
free market wage and the minimum wage, the incentive for noncompliance
(if positive) is greater the larger the quantity of labor demanded
at the effective (expected) wage rate. And the more elastic
the labor demand curve about the expected wage rate, the greater
would be the increase in the incentive for noncompliance as the free
market wage rate declines, since the increase in employment then
would be greater.

Whence the difference between our propositions and those of AS
and G? AS measure the monetary incentive for noncompliance, using
our terminology, by the difference

$$
V = (1 - \lambda)[\pi(w, r, p) - \pi(M, r, p)] - \lambda D, \tag{8}
$$

where $D$ denotes a “fixed” sanction level. They then proceed to
rewrite equation (8) after taking a second-order Taylor expansion of
the profit functions about $(w, r, p)$. Such an expansion should have
resulted in

$$
V \approx (1 - \lambda)L(w, r, p)(M - w)\Big[1 - \tfrac{1}{2}\epsilon\,\frac{M - w}{w}\Big] - \lambda D, \tag{8a}
$$
where $\epsilon \equiv -[\partial L(w, r, p)/\partial w][w/L(w, r, p)]$.
Unfortunately, AS commit a computational error (they apparently confuse
the algebraic and the absolute value of $\epsilon$) and reverse the sign
of the second term on the right-hand side of equation (8a) (see their
eq. [2] on p. 336). They then conclude that “the incentive to comply is
lower . . . the larger is the elasticity of demand for labor (in
absolute value),” and they go on to suggest that “firms . . . for which
wage changes produce large employment adjustments have the greatest
incentives to violate the law” (p. 336). Equation (8a), if valid,
produces, however, just the converse inference since it implies that
$\partial V/\partial\epsilon < 0$. The reader may also note that
although AS conclude intuitively, and correctly, that the incentive for
noncompliance is higher the lower is $w$ relative to $M$, as our
proposition 1c implies, this conclusion cannot be defended
systematically from equation (8a), which indicates that the effect of a
decrease in $w$ on $V$ is ambiguous.

Unlike AS, G assumes that the monetary punishment for noncompliance
is imposed as the difference between $M$ and $w$ per unit of
labor, which amounts to the requirement of a full back payment policy
with no additional sanction ($k = 1$). His version of equation (8) is,
then,

$$
V = \pi(w, r, p) - \lambda L(w, r, p)(M - w) - \pi(M, r, p), \tag{9}
$$

which he also rewrites using a second-order Taylor expansion about
$(w, r, p)$ as

$$
V \approx L(w, r, p)(M - w)\Big[1 - \lambda - \tfrac{1}{2}\epsilon(w, r, p)\,\frac{M - w}{w}\Big], \tag{9a}
$$

with $\epsilon = -(\partial L/\partial w)(w/L)$. From this equation
(which is equivalent to eq. [6] in G's paper, p. 186), G concludes that
“if faced with the choice of paying the market wage or the minimum wage,
the firm will have more incentive to pay the minimum wage if the
difference between the two wages is high [our italics],” a result that
is contradicted by our proposition 1c. And he also concludes that to be
effective as a deterrent it is not necessary that the sanction for
noncompliance with the minimum wage law involve a complete back payment
of the difference $(M - w)$ per unit of labor, since, even if the
proportion of the back pay $k$ is less than unity, equation (9a) could
have a negative sign provided that
$(1 - \lambda) < \tfrac{1}{2}\epsilon(w, r, p)[(M - w)/w]$, a result
that is inconsistent with our proposition 1a as well as with AS's
analysis (see p. 337 of their paper).

The erroneous conclusions in G's paper stem from two basic
methodological shortcomings that are also implicit in AS's formulation:
First, both formulations ignore the “employment effect” of the
noncompliance decision due to the expected fine, which raises the
effective marginal cost of labor services. Put differently, equations (8)
and (9) do not rely on the relevant (optimized) profit function of the
noncomplying firm, which is introduced in equation (2) of this paper.
Second, both formulations attempt to derive inferences about the
influence of the minimum wage level and structure of the legal penalty
on noncompliance, as specified in equations (8a) and (9a) above.
The problem with these approximations, however, is that they are
valid only for quadratic profit functions (i.e., linear demand curves
for labor) and for small increments in the minimum wage $M$ above
the market wage $w$. Applying them to significant discrepancies between
$M$ and $w$ can easily lead to erroneous inferences, as the following
illustration indicates: Let the probability of being detected and
punished for noncompliance, $\lambda$, be zero. Then by the theorem that
the profit function is decreasing in the wage rate, we must have
$V = \pi(w, r, p) - \pi(M, r, p) > 0$. However, in G's analysis equation
(9a) becomes in this case
$V = L(w, r, p)(M - w)\{1 - \tfrac{1}{2}\epsilon(w, r, p)[(M - w)/w]\}$,
which implies that the firm's incentive for noncompliance would
disappear once $w$ fell below $M$ to a level where
$(M - w)/w > 2/\epsilon$, a totally false inference.
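The failure of the second-order approximation for large wage gaps can be seen numerically. The sketch below is an illustration under stated assumptions rather than a calculation from the paper: it reuses a hypothetical technology $f(L) = 2\sqrt{L}$ with no capital, so that $\pi(w) = p^2/w$, $L(w) = (p/w)^2$, and the demand elasticity is constant at $\epsilon = 2$, and it sets the detection probability to zero as in the illustration in the text. Function names are hypothetical.

```python
ELASTICITY = 2.0  # constant elasticity implied by the assumed L(w) = (p / w)**2


def true_excess_profit(w, M, p=1.0):
    """Exact V = pi(w) - pi(M) when the detection probability is zero."""
    return p ** 2 / w - p ** 2 / M


def taylor_excess_profit(w, M, lam=0.0, p=1.0):
    """Second-order form of equation (9a): L(w)(M - w)[1 - lam - eps(M - w)/(2w)]."""
    labor = (p / w) ** 2
    gap = M - w
    return labor * gap * (1.0 - lam - 0.5 * ELASTICITY * gap / w)
```

For a small gap such as `w = 1`, `M = 1.05` the two functions nearly agree. For `w = 1`, `M = 3`, where $(M - w)/w = 2 > 2/\epsilon = 1$, the exact excess profit is still positive while the approximation turns negative, which is the false inference described above.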

Furthermore, the conclusion that ceteris paribus (given $w$, $M$,
$\lambda$, and $k$) the incentive for noncompliance is higher the lower
the elasticity of labor demand, which follows from equations (8a) and
(9a) regardless of whether the monetary sanction is fixed or
proportional to the difference $L(M - w)$,² is valid only for linear
demand curves and for small increments in $M$ or $E(w)$ above $w$.³

Thus, the only generally valid inferences regarding the role of
market wages and the demand for labor services as determinants of
noncompliance with the minimum wage law are those summarized in
our propositions 1 and 2.

II. Some Lessons for Efficient 
Enforcement Policies 

To the extent that minimum wage enforcement policies are designed
to minimize the aggregate social cost of violations of this law, with the
latter assumed to be a monotonically increasing and convex function
of the frequency of violations, then an efficient enforcement agency,
by our analysis, should allocate a higher share of the overall
enforcement budget to law enforcement activities (site inspections,
prosecutions, and trials) in industries and regions where the demand for
labor earning subminimum wages is high and the average wage
earned by low-paid workers is substantially lower than the statutory
minimum wage imposed. This analysis may indeed explain why the
U.S. government allocates almost half of its inspection efforts to the
low-wage southern regions (see AS, p. 338), where the wage bill impact
of higher minimum wages is the largest (see, e.g., U.S. Department
of Labor 1974, p. 41; 1975, pp. 37-38). Our analysis also suggests,
however, that to be efficacious, an enforcement policy cannot
rely on a penalty scheme that requires employers to pay back merely a
fraction of the difference between the minimum and the market wage
per unit labor. If actual enforcement practices in fact relied on such a
“penalty” scheme, as Grenier and Smith seem to claim (see G, p. 185,
n. 1), then systematic variations in the (true) incidence of
noncompliance across firms or regions would be essentially unrelated to
direct enforcement efforts and could be explained, perhaps, largely as a
result of indirect potential losses from noncompliance, such as those
arising from losses of federal or state contracts or related
governmental subsidies.

2 Grenier (p. 186) attributes the difference between his and AS's
conclusions regarding the role of the elasticity of demand for labor to
the different penalty structures assumed.

3 By taking a second-order Taylor approximation of eq. (3) in our paper
about $(w, r, p)$, we obtain a similar inference. Note that our
proposition 2c shows that the differentially greater incentive for
noncompliance generated by a more elastic demand for labor, given a
decline in the market wage, is due to the interaction between the
elasticity and the market wage level effects (see our eq. [7]) rather
than the independent effect of $\epsilon$.
A related issue is the apparent reluctance of the federal minimum 
wage enforcement authority to impose high monetary fines on con¬ 
victed firms, although this would have been a far more efficient 
means of inducing compliance than devoting considerable resources 
to assure a high probability of detecting and convicting violators.

This apparent “inefficiency” suggests that the actual minimum wage
enforcement policy of the government cannot be understood solely in 
terms of the achievement of monetary efficiency in enforcement efforts
(for a survey of alternative social welfare criteria, see Ehrlich

Using daily prices of indexed bonds between 1970 and 1979, we test
whether announcements of the Israeli CPI contain information that
is not already reflected in bond prices. The results indicate that bond
prices reflect about 85 percent of the new information about inflation
as it occurs (i.e., when the Central Bureau of Statistics samples
prices). The announcement of the CPI 15 days after the end of the
sampling period causes the remaining 15 percent adjustment in
bond prices. This evidence raises questions about the empirical importance
of misperceptions about inflation as a source of nonneutrality
in monetary policy.


I. Introduction 

This paper examines the timing of the reaction of indexed bond
prices to the occurrence and subsequent announcement of inflation. 

This work has evolved from an earlier paper by Huberman entitled “The Informational
Efficiency of the Israeli Bond Market.” Previous drafts of this paper were entitled
“Inflation and the Returns to Indexed Bonds.” Timothy Thompson, Patricia O'Brien,
and Ralph Sanders provided computing assistance. Baruch Lev provided access to the
data on bond prices. Robert Barro, Douglas Diamond, Eugene Fama, Stanley Fischer,
Martin Gruber, Robert Holthausen, Michael Jensen, Richard Leftwich, John Long,
Robert Merton, Wayne Mikkelson, Michael Mussa, Charles Plosser, Stephen Ross,
Richard Ruback, Clifford Smith, Jerold Warner, Walter Wasserfallen, Jerold Zimmerman,
and anonymous referees have provided valuable comments. Support for this
project has been provided by the National Science Foundation, the Center for Research
in Security Prices at the University of Chicago, the Batterymarch Financial Management
Corporation, and the Center for Research in Government Policy and Business at
the University of Rochester.

Our data are from Israel and consist of a monthly time series of
Consumer Price Index (CPI) announcements and a daily time series 
of prices of government-issued bonds that are indexed to the CPI. 
After estimating prediction models for the Israeli CPI, we examine 
the relations between bond returns and the expected and unexpected 
components of the CPI. 

The Israeli CPI is announced approximately 2 weeks after the end 
of the month in which commodity prices are sampled. If bond traders 
are able to learn about inflation by observing the nominal commodity 
prices, there will be no reaction of bond prices when the CPI is an¬ 
nounced. On the other hand, if the bond market cannot infer the 
behavior of inflation by observing individual commodity prices, the 
announcement of the CPI will be associated with a change in bond 
prices to reflect the previously unknown information about the CPI. 
In principle, traders in the bond market have access to all the infor¬ 
mation that is used to construct the CPI, although it is probably pro¬ 
hibitively expensive for any one trader to duplicate the data collection 
and assimilation activities of the Central Bureau of Statistics. Never¬ 
theless, all traders have obvious pecuniary incentives to obtain infor¬ 
mation about the future behavior of the CPI, since the prices of 
indexed bonds are essentially the outcome of bets about the future
level of the CPI.

This paper investigates the extent to which the market successfully
aggregates information that individuals hold collectively but possibly
no single individual possesses completely. Many attempts have been 
made to incorporate uncertainty and diverse information of traders 
into equilibrium models of capital markets, including Grossman 
(1976, 1978), Radner (1979), Grossman and Stiglitz (1980), Diamond
and Verrecchia (1981), and Admati (1983). These papers investigate 
whether prices can aggregate investors’ diverse information to the 
extent that they serve as a sufficient statistic for the aggregate information
of traders, in which case the price is called “fully revealing.”
Suppose the announcement of the CPI merely aggregates information
about commodity prices. This information was available to bond
traders at least a fortnight prior to the announcement. If bond prices 
are fully revealing, then the market should not react to the announce¬ 
ment when it occurs, because the information contained in the an¬ 
nouncement was impounded in prices 2 weeks earlier. 

A related question arises in rational expectations models of business
cycles, such as Lucas (1973, 1975) and Barro (1980). In these
models there are several markets in the economy, and there are re¬ 
strictions on the flow of traders and information across markets. 
Shocks are both economy-wide and local (i.e., vary across markets). 
Prices in each market reflect both local and economy-wide shocks, and
traders are unable to determine whether a nominal price change is
due to a change in the relative price of the good or to overall inflation
of all goods' prices. This confusion about inflation and relative price 
changes implies that monetary policy can have real effects. 

The empirical tests in this paper suggest that about 85 percent of 
the reaction of bond prices to unexpected inflation occurs contemporaneously
with the sampling of individual commodity prices, from
2 to 6 weeks prior to the announcement. The remaining 15 percent of
the reaction to unexpected inflation occurs on the day following the
announcement. Thus, while the evidence is inconsistent with the ex¬ 
treme hypothesis that indexed bond prices fully reflect the informa¬ 
tion about inflation as it occurs, the extent of confusion about infla¬ 
tion is not large. 


Schwert (1981), Cornell (1983), and Urich and Wachtel (1984)
study the empirical relations between announcements of the U.S. CPI
or money supply and the prices of stocks and nominal bonds in the
United States. The tests in this paper are more powerful for two
reasons: (a) the payoffs to indexed bonds are directly linked to the
CPI, and (b) the variance of the unexpected component of the CPI
has been much higher in Israel than in the United States. Therefore,
the results in this paper show a much stronger relation between unexpected
inflation and asset returns than has been found in
previous studies.


Section II describes the inflation and bond price data and presents
time-series models for the inflation rate in Israel. Given the estimates
of expected and unexpected inflation from the time-series models, we
examine the reaction of bond prices to the components of the CPI.
Using daily returns to a portfolio of indexed
bonds, the tests in Section III estimate the speed of adjustment of
indexed bond prices to information about inflation. Section IV discusses
alternative interpretations of the results in Section III.
Section V contains brief concluding remarks.


II. Inflation and Bond Price Data 

A. The Consumer Price Index 


The Consumer Price Index (CPI) for Israel is compiled by the Central
Bureau of Statistics [...] broad consumption [...] (about 1,500 locations
throughout Israel). The index for month t is announced after the
[...] financial contracts such as indexed bonds linked to the CPI, the
Central Bureau of Statistics tries to avoid leaks of information prior to
the official announcement. Nevertheless, consumers observe the 
prices of individual commodities at the same time that the Central
Bureau of Statistics collects the sample of prices that eventually aggregate
into the CPI. Detailed information about the construction of the
Israeli CPI is available in Israel Central Bureau of Statistics (1968).

To analyze the reaction of bond prices to information about infla¬ 
tion, it is important to determine the part of CPI inflation that is 
expected before the month when the inflation occurs. The difference 
between the actual inflation rate for January, announced on February 
15, and the expected inflation rate based on information available on 
January 1 represents the information that could be learned about 
inflation between January 1 and February 15. Thus, it is the timing of 
the reaction of bond prices to unexpected inflation that is of primary 
interest. 

One way to measure the expected and unexpected components of 
the CPI inflation rate is to use a statistical time-series model to predict 
future inflation based on past inflation rates. This approach has been 
used frequently with U.S. CPI inflation data, by Hess and Bicksler 
(1975), Nelson (1976), Nelson and Schwert (1977), and Schwert
(1981), among others. Part A of table 1 contains means, standard
deviations, and the first 12 autocorrelations of the monthly Israeli
CPI inflation, $p_t$, from 1952 to 1981 and for several subperiods. Part B
of table 1 contains summary statistics for the first differences of the
CPI inflation rate, $\Delta p_t$. The autocorrelations of the inflation rate are
large at all 12 lags for the overall 1952-81 sample period and for the
1972-81 subperiod. The autocorrelations of the changes in inflation
in part B are generally small, except at lags 1, 2, 11, and 12. These
results indicate that the stochastic process generating Israeli inflation 
is nonstationary. Moreover, the pattern of autocorrelations in part B 
suggests that the changes in inflation follow a second-order moving 
average process with a seasonal moving average component. 1 
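The identification step described above rests on sample autocorrelations of the inflation rate and of its first differences. A minimal sketch of that calculation follows; the function names and the short inflation series are hypothetical illustrations, not the Israeli data or the authors' own code.

```python
def autocorrelations(series, max_lag=12):
    """Sample autocorrelations r_1..r_max_lag of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
        for lag in range(1, max_lag + 1)
    ]


def first_differences(series):
    """Changes in the series: x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical monthly inflation rates (illustration only).
inflation = [0.010, 0.012, 0.011, 0.015, 0.018, 0.017, 0.021, 0.024,
             0.023, 0.027, 0.030, 0.029, 0.033, 0.036, 0.035, 0.039]
r_levels = autocorrelations(inflation, max_lag=3)
r_changes = autocorrelations(first_differences(inflation), max_lag=3)
```

Large, slowly decaying values in `r_levels` alongside smaller values in `r_changes` are the kind of pattern the text reads as evidence of nonstationarity in the level of inflation.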

Table 2 contains estimates of this model for several sample periods 
along with some tests for the adequacy of the model. The estimates of 
the model parameters in table 2 are fairly stable across the three 10- 
year subperiods. However, the test statistics for constancy of the pa¬ 
rameters within the 10-year subperiods are significant at the 5 percent 
level, except for the 1972-81 period. The Box-Pierce (1970) test sta¬ 
tistics indicate that the autocorrelations of the residuals from these 
models are small. The Studentized Range test statistics, S.R.(u), test 
the hypothesis that the distribution of residuals is normal with constant

1 See Box and Jenkins (1976) for a discussion of nonstationary time-series processes
such as this.

mean and variance. These statistics are large in many of the
subperiods, indicating that there are outliers (i.e., large positive or 
negative unexpected inflation) or changes in the variance of unex¬ 
pected inflation. The estimates of the residual standard deviation, 
S(u), indicate that the variance of unexpected inflation was about four 
times higher in 1972-81 than in 1962-71, for example. 

Based on the stability of the coefficient estimates in table 2, it seems 
reasonable to assume that the predictions and prediction errors from 
these models can be used to approximate the expected and unex¬ 
pected components of the CPI inflation rate. In the subsequent tests 
using bond returns, expected inflation will be estimated as the predic¬ 
tion from a time-series model like those in table 2, where the model 
parameters are estimated using the most recent 60 months of infla¬ 
tion data. Our measure of unexpected inflation will be the prediction 
error from this time-series model. 
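The split into expected and unexpected inflation can be sketched as follows, assuming a multiplicative Box-Jenkins form for the model named in the text: the change in inflation equals (1 - θ1·B - θ2·B²)(1 - Θ·B¹²) applied to the shocks. The sign convention, the zero presample shocks, the function names, and any parameter values passed in are assumptions for illustration; the estimates in table 2 are not reproduced here, and the rolling 60-month estimation step is omitted.

```python
def _lagged_terms(shocks, theta1, theta2, seasonal_theta, season):
    """Weighted sum of past shocks that enters the next change in inflation."""
    def lagged(j):
        idx = len(shocks) - j
        return shocks[idx] if idx >= 0 else 0.0   # presample shocks set to zero

    return (theta1 * lagged(1) + theta2 * lagged(2)
            + seasonal_theta * lagged(season)
            - theta1 * seasonal_theta * lagged(season + 1)
            - theta2 * seasonal_theta * lagged(season + 2))


def one_step_forecast(inflation, theta1, theta2, seasonal_theta, season=12):
    """Expected inflation next month given the history in `inflation`."""
    shocks = []
    for prev, curr in zip(inflation, inflation[1:]):
        change = curr - prev
        # Invert change = shock - (weighted past shocks) to recover the shock.
        shocks.append(change + _lagged_terms(shocks, theta1, theta2,
                                             seasonal_theta, season))
    expected_change = -_lagged_terms(shocks, theta1, theta2,
                                     seasonal_theta, season)
    return inflation[-1] + expected_change


def unexpected_inflation(actual, history, **params):
    """Prediction error: realized inflation minus its one-step forecast."""
    return actual - one_step_forecast(history, **params)
```

With all moving-average parameters set to zero the forecast collapses to the last observed inflation rate, which is a convenient sanity check on the recursion.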


B. Indexed Bonds

Indexed bonds are widely held and actively traded in Israel. In 1976 
indexed bonds represented about 67 percent of the total market value 
of listed securities and about 30 percent of the trading volume on the 
Tel Aviv Stock Exchange. In addition, a large amount of bond trad¬ 
ing occurs in the over-the-counter market in Israel. There is also a 
large volume of option bonds outstanding where the holder can 
choose to receive either a fixed payment or a partially indexed payoff 
when the bond matures. 2 

Daily prices of indexed bonds are collected from the Tel Aviv Stock 
Exchange's Official Quotations for 77 months between January 1970 and 
January 1979. Continuous data are available from January 1970 
through August 1971 (402 observations), from January 1974 through 
June 1975 (329 observations), and from October 1975 through January
1979 (777 observations). 3 We use an equally weighted portfolio of
12 actively traded indexed bonds to measure a daily holding period
return, $R_t$. This portfolio represents a sample of the variety of indexed
bonds issued by the Israeli government over this period. It
contains coupon bonds with coupon rates ranging from 3 to 7 percent 
and maturities at issue of between 5 and 10 years. Once a bond enters 
the portfolio it stays until about 3 months prior to maturity, at which 
point it is replaced by another bond of between 5 and 10 years to 
maturity. The Tel Aviv Stock Exchange delists bonds that are within a

2 Bank of Israel (1977, chap. 19) contains a detailed description of securities markets
in Israel.

3 The data were collected from the library of Tel Aviv University's Recanati Graduate
School of Business Administration. Because some of the quotations may be missing,
there may be some days where trading occurred but for which we have no price data.

few months of maturity. Thus, the portfolio represents a mixture of
maturities at any point in time. On maturity, the bondholder receives 
the par value multiplied by the change in the CPI since the bond was 
issued. Some bond issues have only partial indexation (80 or 90 per¬ 
cent). For an 80 percent indexed bond, the payoff at maturity is 20 
percent of the par value plus 80 percent of the par value times the 
growth in the CPI since the bond was first issued. The coupon pay¬ 
ments are not indexed. 
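The redemption rule just described can be written out as a short calculation. The sketch below is illustrative only: the function name, argument names, and the CPI values are hypothetical, and coupons are ignored because the text states they are not indexed.

```python
def redemption_payoff(par, cpi_at_issue, cpi_at_maturity, indexed_share=1.0):
    """Principal repaid at maturity for a bond with the given indexed share."""
    cpi_growth = cpi_at_maturity / cpi_at_issue
    # The unindexed share is repaid at par; the indexed share grows with the CPI.
    return par * ((1.0 - indexed_share) + indexed_share * cpi_growth)


# Hypothetical case: the CPI rises by 50 percent over the life of the bond.
full = redemption_payoff(100.0, 100.0, 150.0)           # fully indexed
partial = redemption_payoff(100.0, 100.0, 150.0, 0.8)   # 80 percent indexed
```

In this hypothetical case the fully indexed bond repays 150 per 100 of par, while the 80 percent indexed bond repays 20 plus 80 times 1.5, or about 140.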

There are several reasons the prices of indexed bonds might vary 
over time in addition to changes in the CPI. First, there is some 
probability of default, either as a result of wars or because the Israeli 
government changes the terms of the bond contract ex post. 4 Second,
the Bank of Israel apparently intervenes in the bond market with the
intention of affecting bond prices. 5 Finally, even if the principal and
coupon payments were fully indexed and there was no default risk, 
the prices of long-term bonds would change if real discount rates 
varied over time. Thus we do not expect to find that all of the varia¬ 
tion in bond prices is attributable to variation in expected or unex¬ 
pected inflation. 

In principle it would be better to incorporate data on coupon yield,
maturity, and so forth into our analysis. Unfortunately, such data are 
not easily available. The data as we have them are good enough for 
our purpose since we are interested in the timing of the market re¬ 
sponse to unexpected inflation, not the absolute magnitude of the 
response. If we had data on more bonds and on the detailed characteristics
of each bond, we could presumably construct a more powerful
test of the timing of the market response to unexpected inflation. 6

III. Information about the CPI and Bond Price 
Behavior 

A. Effects of Information Release on Bond Price 
Variability 

One way to analyze the effects of the CPI announcement on indexed 
bond prices is to examine the variability of bond returns on the days 

4 See Bank of Israel (1977, pp. 442-43) for a discussion of government actions taken
in December 1975 that affected payoffs to subsequently issued indexed bonds and the
associated fear that previously issued bonds would also be affected. At this time the
government also prohibited institutional investors from trading indexed bonds on
the Tel Aviv Stock Exchange.

5 See Bank of Israel (1977, pp. 448-50) for a discussion of the trading activity by the
Bank of Israel.

6 An unpublished appendix to this paper illustrates the effects of nonindexed
coupons, partial indexation of principal, term to maturity, and varying real discount
rates on the sensitivity of indexed bond prices to unexpected inflation. This appendix is
available from the authors.

TABLE 3

Estimates of the Standard Deviation of Indexed Bond Returns on the Days
around CPI Announcements

Day Relative to the
CPI Announcement, i      Standard Deviation, $\sigma_i$

       -5                      .00773
       -4                      .00680
       -3                      .00700
       -2                      .00820
       -1                      .00501
        0                      .00929
        1                      .00803
        2                      .00939
        3                      .00603
        4                      .00636
        5                      .00689

surrounding the CPI announcement. If new information is released
as a result of the announcement, variability should be higher on the
day after the announcement.

Table 3 contains estimates of the standard deviation of bond returns
for 11 trading days around the announcement day. For example,
$\sigma_0$ represents the sample standard deviation of the bond returns
on the first trading day after the CPI announcement (day 0)
based on the 77 announcements in our sample. Table 3 also contains
sample standard deviations for 5 trading days before (days -5 to -1)
and 5 trading days after (days 1 to 5) the announcement day. The
standard deviation of bond returns on the day following the announcement
is the second highest among the 11 estimates in table 3.
The standard deviation on day 0 is about 25 percent greater than the
estimate of the standard deviation of bond returns for the entire
sample. 
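As a quick check on the ranking described above, the table 3 estimates can be sorted directly. In this sketch the day labels for the last two rows (4 and 5) are inferred from the table's 11-day window, and the variable names are illustrative.

```python
# Standard deviations of bond returns by day relative to the announcement,
# as reported in table 3 (labels 4 and 5 inferred from the 11-day window).
table3 = {
    -5: 0.00773, -4: 0.00680, -3: 0.00700, -2: 0.00820, -1: 0.00501,
    0: 0.00929, 1: 0.00803, 2: 0.00939, 3: 0.00603, 4: 0.00636, 5: 0.00689,
}

# Days ordered from the most to the least variable.
ranked_days = sorted(table3, key=table3.get, reverse=True)
second_highest = ranked_days[1]
```

Under that labeling, day 0 (the first trading day after the announcement) ranks second, behind day 2, and the announcement day itself (day -1) has the lowest estimate.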

This crude test suggests that there is information in the CPI an¬ 
nouncement that was not previously available. However, since the 
evidence in tables 1 and 2 indicates that the CPI inflation rate was not 
stationary over this sample period, it is doubtful that the process
generating bond returns was stationary. In this case the sample stan¬ 
dard deviations in table 3 would have unknown statistical properties 
and statistical tests based on these estimates would be unreliable. 

Table 4 contains nonparametric tests based on ranks of the daily 
returns within each month. These nonparametric tests should correct 
for nonstationarity in the process generating bond returns due to the
nonstationary inflation rate. We rank the daily bond returns within a
month from lowest to highest, 1 to $N_t$, where $N_t$ is the number of
trading days in month t. The days are labeled relative to the announcement
day, so i = 0 is the day after the announcement, i = -1
is the day of the announcement, i = 1 is the second day after the
announcement, and so forth. Thus, for each day i relative to the CPI
announcement within month t there is a ranking $M_{it}$, where $1 \le M_{it} \le N_t$.
To normalize the rankings with respect to the number of observations
per month, we define the variable $X_{it} = (M_{it} - 1)/(N_t - 1)$,
which varies between zero and one. Finally, since we are interested in
either very high or very low returns associated with the announcement
of the CPI, we create the variable $Y_{it} = 2|X_{it} - .5|$, which also
varies between zero and one. Under the null hypothesis that bond
returns are unaffected by the CPI announcement, both $X_{it}$ and $Y_{it}$
should be approximately uniformly distributed across months t for a
given day i. Unless we know whether inflation is unexpectedly high or
low when it is announced, we have no hypothesis about $X_{0t}$, but $Y_{0t}$
should be large if bond returns are either very high or very low after the
announcement. To test this hypothesis we regress $Y_{it}$ against the set of
dummy variables $D_{it}$,

$$Y_{it} = \beta + \sum_{i=-5}^{5} \delta_i D_{it} + u_{it}, \tag{1}$$

where $D_{it} = 1$ when the observation occurs on day i relative to the
announcement, and $D_{it} = 0$ otherwise. For example, $D_{0t} = 1$ on the
day after the CPI announcement (day 0). The coefficient $\delta_i$ measures
the differential dispersion in bond return rankings on day i relative to
the CPI announcement.
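The rank transformation described above can be sketched for a single month of daily returns. The function name and the sample returns are hypothetical, and ties are broken by order of appearance, which is an assumption the text does not address.

```python
def dispersion_scores(returns):
    """Return (X, Y) for one month's daily returns, in the input order.

    X is the normalized rank in [0, 1]; Y = 2|X - .5| is large when a
    return is either among the lowest or among the highest of the month.
    Requires at least two trading days.
    """
    n = len(returns)
    order = sorted(range(n), key=lambda j: returns[j])
    rank = [0] * n
    for position, j in enumerate(order, start=1):
        rank[j] = position            # 1 = lowest return, n = highest
    x = [(m - 1) / (n - 1) for m in rank]
    y = [2 * abs(v - 0.5) for v in x]
    return x, y


# Hypothetical three-day month.
x_scores, y_scores = dispersion_scores([0.01, -0.02, 0.03])
```

The middle-ranked return receives a dispersion score of zero, while the lowest and highest returns both receive a score of one, which is why the second variable is the natural one to examine when the sign of the surprise is unknown.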

The results in table 4 show that bond returns are more variable on 
the day following CPI announcements. In all but the 1970-71 subperiod,
the estimates of $\delta_0$ are positive, and for the total sample they
are more than two standard errors above zero. There is no indication
that any of the other dummy variable coefficients are different from 
zero, so it seems that the announcement day has more price variability 
than the other 10 trading days around the announcement. This crude 
test indicates that there is at least some information in the announce¬ 
ment that is not available prior to the announcement. The tests that 
follow give a more detailed picture of this phenomenon. 
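Because the dummy variables in a regression of this kind are mutually exclusive, ordinary least squares reduces to comparing group means: the intercept is the mean of the dependent variable over days not covered by any dummy, and each dummy coefficient is the mean for its day minus that intercept. The sketch below uses this property; the function name and the data are hypothetical, and treating the uncovered days as the baseline group is an assumption about how the regression was set up.

```python
def dummy_regression(y, event_days, dummy_days):
    """OLS of y on a constant and one 0/1 dummy per day in dummy_days.

    Assumes at least one observation falls outside dummy_days and at
    least one observation falls on each dummy day.
    """
    baseline = [v for v, d in zip(y, event_days) if d not in dummy_days]
    beta = sum(baseline) / len(baseline)
    deltas = {}
    for day in dummy_days:
        group = [v for v, d in zip(y, event_days) if d == day]
        deltas[day] = sum(group) / len(group) - beta
    return beta, deltas


# Hypothetical dispersion scores and their days relative to the announcement.
beta, deltas = dummy_regression(
    y=[0.2, 0.4, 0.9, 0.7, 0.5],
    event_days=[-8, -7, 0, 0, 1],
    dummy_days=[0, 1],
)
```

A positive coefficient for day 0 in such a calculation means that returns on the day after the announcement sit further toward the extremes of the monthly ranking than returns on baseline days.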

B. Effects of Inflation on Daily Bond Returns 

To measure the reaction of indexed bond prices to new information 
about inflation, we are most interested in the relation between bond
returns and unexpected inflation, $u_t$. The current unexpected inflation
rate has two effects on the price of an indexed bond. The predicted
future level of the CPI is changed by the amount of current
unexpected inflation. In addition, the future levels of expected inflation
are increased (decreased) as a result of positive (negative) unexpected
inflation. 7 Thus, the predicted redemption payoff is increased
by current unexpected inflation. The increase in future expected in¬ 
flation will probably increase nominal interest rates, which decreases 
the value of future cash flows. 

One way to determine the time when the bond market first becomes 
aware of the unexpected inflation rate is to analyze daily bond returns 
around the announcement of the CPI. To test whether the announce¬ 
ment conveys information that is not already reflected in bond prices, 
we could regress the bond return for the first trading day after the 
announcement on the unexpected inflation rate corresponding to the 
announced CPI, 

$$R_t = \alpha + \gamma_0 u_t + \epsilon_t. \tag{2}$$

For example, if the January CPI is announced on February 15, $R_t$ is
the bond return on February 16 and $u_t$ is the unexpected inflation for
January. Since we have 77 months of bond return data, the regression
in (2) would use 77 day 0 bond returns to estimate the coefficient of
unexpected inflation $\gamma_0$. If the announcement of the CPI affects bond
prices, $\gamma_0$ should be positive. If the bond market is able to use sources
other than the official announcement to find out about the inflation
that occurred in January, there should be no adjustment of bond
prices as a result of the official announcement on February 15, and $\gamma_0$
should equal zero.
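The announcement-day regression is a simple regression with one explanatory variable, so its slope can be computed in closed form. The sketch below is a generic least-squares fit; the function name and the three data points are hypothetical and stand in for the 77 day 0 observations described in the text.

```python
def ols_line(x, y):
    """Intercept and slope from a least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical unexpected inflation rates and day 0 bond returns.
surprises = [0.00, 0.01, 0.02]
day0_returns = [0.001, 0.003, 0.005]
alpha_hat, gamma0_hat = ols_line(surprises, day0_returns)
```

In this made-up example the slope is 0.2, meaning that one percentage point of unexpected inflation is associated with a day 0 return that is 0.2 percentage points higher.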

As a conventional test of capital market efficiency, we could esti¬ 
mate the reaction, if any, of bond prices on the days following the CPI 
announcement. Defining $R_{t+k}$ as the bond return on the (k + 1)st day
following the announcement of the CPI and $u_t$ as the unexpected
component of the corresponding CPI, the regression

$$R_{t+k} = \alpha + \gamma_k u_t + \epsilon_{t+k} \tag{3}$$

can be used to estimate any reaction of bond prices that occurs after
day 0. If the coefficients $\gamma_k$ are nonzero, it implies that bond prices are
slow to react to the CPI announcement.

Similarly, by looking at the regression of bond returns for the (k -
1)st day prior to the CPI announcement, $R_{t-k}$, we could estimate the

7 The persistent positive autocorrelations of the inflation rates in table 1 indicate that 
current unexpected inflation increases expectations of inflation for many future periods.

reaction of bond prices to unexpected inflation prior to the announcement:

$$R_{t-k} = \alpha + \gamma_{-k} u_t + \epsilon_{t-k}. \tag{4}$$

For example, if there are about 20 trading days per month and about 
10 trading days between the end of the month and the subsequent 
CPI announcement, there are about 30 trading days between the 
beginning of January and the announcement of the January CPI. 
Therefore, we could estimate up to 30 different regression 
coefficients $\gamma_{-k}$ to measure the timing of the reaction of bond prices
to the unexpected inflation rate. If bond traders can infer some of the
new information about inflation before the announcement, some of
the coefficients $\gamma_{-k}$ should be positive. For example, suppose that
inflation occurs throughout January that is not predictable from the
information available at the end of December. Bond prices might
increase at the same time that the unexpected inflation occurs, because
bond traders can observe the simultaneous unexpected increases
in the prices of a variety of consumption goods (i.e., there is
little confusion about changes in relative prices versus overall inflation).
If this scenario is typical, the coefficients representing the days
when the inflation occurs, $\gamma_{-30}$ to $\gamma_{-11}$, should be positive. If bond
traders can infer the inflation rate as it occurs, the subsequent announcement
by the Central Bureau of Statistics is redundant and the
coefficients from $\gamma_{-10}$ onward should be zero.

The regressions in (2), (3), and (4) focus on the effect of unexpected
inflation on daily bond returns. It is likely that expected inflation
also affects bond returns, since the expected real return to holding
the bond is the expected nominal return minus the expected
inflation rate. The autocorrelations in table 1 indicate that the expected
CPI inflation rate varied substantially during the 1970-79
period. Therefore, we include a measure of the daily expected inflation
rate (the prediction of monthly inflation based on the 60 most
recent months of CPI data divided by the number of trading days in
the month) in the regressions to reflect the fact that expected bond
returns should be higher when expected inflation rises. For example,
for the announcement day returns, equation (2) would be modified by
including a measure of expected inflation, $p_t$,

$$R_t = \alpha + \beta p_t + \gamma_0 u_t + \epsilon_t. \tag{5}$$

Since expected and unexpected inflation are uncorrelated, the estimate
of the coefficient of unexpected inflation $\gamma_0$ should be unaffected.
However, to the extent that bond returns are higher when
expected inflation is high (so that $\beta$ is positive), adding the expected
inflation variable in (5) will decrease the variance of the errors and
improve the precision of our estimates of the effects of unexpected
inflation. In effect, we are modeling the nonstationarity of inflation 
that was discussed in Section IIIA above. 

If expected nominal returns are positively related to expected infla¬ 
tion (which is presumably the motivation for buying an indexed 
bond), the coefficient of expected inflation should be positive, $\beta > 0$.
If the expected real returns to indexed bonds are constant over time,
the coefficient of expected inflation should equal unity, $\beta = 1.0$.

Instead of estimating many separate regressions of the type illus¬ 
trated by (2), (3), (4), and (5), we combine all of these coefficients into 
a multiple regression. 

$$R_t = \alpha + \beta p_t + \sum_{k=-5}^{30} \gamma_{-k} u_{t+k} + \epsilon_t. \tag{6}$$

The variable $p_t$ is used to measure the effect of expected inflation on
daily bond returns. Two values of the expected inflation variable are
used each month. For the trading days after the CPI announcement,
$p_t$ is equal to the one-step-ahead forecast from a time-series model like
those estimated in table 2. The model is estimated using the 60 most
recent monthly inflation rates, and the prediction of monthly inflation
is divided by the number of trading days in the month. For the trading
days up to and including the CPI announcement, $p_t$ is equal to the
two-step-ahead forecast based on 60 months of data not including the
current value of the CPI. This prediction of monthly inflation is also
divided by the number of trading days in the month. 8

For example, the December CPI is not announced until January 15; 
therefore, until January 16 we base our prediction of January's infla¬ 
tion on the 60 months of data up through November. If this two-step- 
ahead prediction of inflation is .04 (4 percent per month) and there 
are 20 trading days in January, $p_t = .04/20 = .0020$ for each of the
trading days from January 1 through January 15. As of January 16
the December CPI is known, so we use the new one-step-ahead forecast
of inflation. Suppose the new forecast is .03 (3 percent per
month); then $p_t = .03/20 = .0015$ for each of the trading days from
January 16 through January 31. The expected inflation variable is 
transformed into units of trading days to be comparable with the 
bond returns. 
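The January example above can be reproduced with a short calculation. In this sketch the function name and the particular list of 20 trading dates are hypothetical; the forecasts (.04 and .03) and the announcement date (January 15) are the ones used in the text.

```python
def daily_expected_inflation(trading_days, announcement_day, two_step, one_step):
    """Daily expected inflation for each trading day of one month.

    Days up to and including the announcement use the two-step-ahead
    monthly forecast; later days use the one-step-ahead forecast. Both
    are divided by the number of trading days in the month.
    """
    n = len(trading_days)
    return {
        day: (two_step if day <= announcement_day else one_step) / n
        for day in trading_days
    }


# Hypothetical calendar with 20 trading days in January.
january = [2, 3, 4, 5, 8, 9, 10, 11, 12, 15,
           16, 17, 18, 19, 22, 23, 24, 25, 26, 29]
p = daily_expected_inflation(january, announcement_day=15,
                             two_step=0.04, one_step=0.03)
```

The daily value is about .0020 through January 15 and about .0015 from January 16 onward, matching the figures in the text.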

The variable $u_t$ equals the unexpected component of the CPI on the
day after the announcement and zero otherwise. Thus, on February

8 We also considered regression specifications that allowed the expected bond return
to vary by the day of the week or by the number of days since the last trade. We also
allowed the variability of bond returns to be related to the number of days since the last
trade. None of these specifications improved the statistical properties of the regressions
reported in the text.

16 $u_t$ equals the unexpected inflation rate for January based on the
one-step-ahead forecast of inflation using 60 months of data including
December’s CPI. The multiple regression in (6) is a convenient
way to estimate unexpected inflation coefficients for up to 30 trading
days prior to the announcement ($\gamma_{-30}$) and up to 5 trading days after
the announcement ($\gamma_5$). For the days between the beginning of January
and the announcement of December’s CPI on January 15, we use
a measure of unexpected January inflation based on the two-step- 
ahead prediction from November. Thus, for days -30 to -20, the 
expected inflation variable does not use information about the previ¬ 
ous month’s CPI (which is not yet announced). 

Since there are approximately 20 trading days per month, the
coefficients $\gamma_{-30}$ to $\gamma_{-11}$ represent daily responses to unexpected inflation
during the period when the inflation occurs, the coefficients
$\gamma_{-10}$ to $\gamma_{-1}$ represent responses of bond prices between the end of the
month when the inflation occurs and the subsequent announcement,
the coefficient $\gamma_0$ measures the response of bond prices on the day
after the announcement, and the coefficients $\gamma_1$ to $\gamma_5$ represent any
response of bond prices during the next 5 trading days after the
announcement. The total effect of unexpected inflation on bond
prices is measured by adding up the coefficients $\gamma_{-30}$ to $\gamma_5$.
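One way to picture the regressors in the multiple regression is as shifted copies of the daily unexpected inflation series, which is nonzero only on the day after each announcement. In the sketch below the column for a given day offset j holds the value of that series j trading days earlier (or later, for negative j). The function name is hypothetical, and filling positions that fall outside the sample with zero is an assumption; dropping those rows would be an equally reasonable choice.

```python
def unexpected_inflation_columns(u, first=-30, last=5):
    """Map each day offset j to the regressor that multiplies gamma_j.

    Row t of column j holds u[t - j], so the column for j = -30 looks
    30 trading days ahead and the column for j = 5 looks 5 days back.
    """
    n = len(u)
    return {
        j: [u[t - j] if 0 <= t - j < n else 0.0 for t in range(n)]
        for j in range(first, last + 1)
    }


# Hypothetical six-day sample with an announcement surprise on day index 3.
u = [0.0, 0.0, 0.0, 0.02, 0.0, 0.0]
columns = unexpected_inflation_columns(u, first=-2, last=1)
```

With this layout the surprise appears in the column for offset -2 two rows before the day 0 row, so its coefficient picks up any reaction of returns two trading days ahead of day 0.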


C. The Speed of Reaction of Bond Prices to Unexpected
Inflation 

Table 5 contains estimates of regressions similar to (6) for the 1970-79
sample period and the three subsamples, 1970-71, 1974-75, and
1975-79. To economize on the number of parameters to be estimated,
the results in table 5 impose the constraint that the
coefficients of unexpected inflation are equal within each 5-day interval
from -30 to -26, -25 to -21, and so forth. The coefficient for
the announcement day, $\gamma_0$, is estimated separately.

The last column in table 5 measures the total response of bond 
returns to unexpected inflation by summing the weekly coefficients, 
multiplying this sum by 5.0, and adding the announcement day 
coefficient, $\gamma_0$. This estimate of the total response of bond returns to
unexpected inflation helps put the magnitude of the announcement 
day coefficient in perspective. Even though there is a statistically 
significant announcement day effect in table 5, the announcement 
day reaction of bond prices is only a small part of the reaction to 
unexpected inflation. 
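
The arithmetic behind the last column of table 5 can be sketched in a few lines of Python. This is only an illustration: the interval coefficients below are hypothetical placeholders rather than the entries of table 5, the function name is ours, and we assume seven 5-day intervals (six before the announcement and one after). Only the announcement-day coefficient of .081 for the total sample is taken from the text.

```python
def total_response(interval_coefs, announcement_coef, days_per_interval=5):
    """Sum of the daily coefficients when each interval coefficient applies to
    `days_per_interval` trading days, plus the announcement-day coefficient."""
    return days_per_interval * sum(interval_coefs) + announcement_coef


# Hypothetical 5-day-interval coefficients (NOT the entries of table 5),
# ordered from days -30..-26 through days -5..-1, then days +1..+5.
interval_coefs = [0.00, 0.03, 0.04, 0.03, 0.00, 0.00, 0.00]
gamma_0 = 0.081  # announcement-day coefficient for the total sample (from the text)

total = total_response(interval_coefs, gamma_0)
announcement_share = gamma_0 / total
print(f"total response = {total:.3f}; announcement-day share = {announcement_share:.1%}")
```

Multiplying by 5 simply undoes the constraint that one coefficient stands in for five daily coefficients, so the total is comparable to the sum of unconstrained daily estimates.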

Several things are notable about the results in table 5. First, the
results for the January 1970 to August 1971 subperiod are weak
compared with the other two subperiods, or with the overall sample,
so most of the subsequent remarks do not apply to the first subperiod.
Second, the effect of expected inflation on nominal bond returns is
positive for the total sample period. More important, the coefficients
for the unexpected inflation variables in table 5 are generally positive
and some are several standard errors above zero. For example, for
the total sample of 1,508 daily observations, the announcement day
coefficient is .081 with a standard error of .033, and the estimates of
$\gamma_{-11}$ to $\gamma_{-15}$ and $\gamma_{-21}$ to $\gamma_{-25}$ are more than two standard errors
above zero. These estimates suggest that about 14 percent of the total
effect of unexpected inflation on bond returns occurs on the day the
CPI is announced, while about 86 percent of the effect occurs during
the 11-25 days prior to the announcement. In other words, most of
the adjustment occurs within the month the inflation actually occurs,
but there is still a substantial additional adjustment on the announcement
day. This pattern is similar for the subperiods.

[Fig. 1. Cumulative effects of unexpected inflation on Israeli indexed bond returns, January 1970-January 1979. Vertical axis: sum of unexpected inflation coefficients; horizontal axis: days relative to announcement date, k.]

The timing of the reaction of bond prices to unexpected inflation is
illustrated in figure 1. This graph is based on an unconstrained estimate
of equation (6) using 1,508 daily observations between January
1970 and January 1979. Figure 1 shows how the effects of unexpected
inflation on bond prices accumulate during the 6 weeks from
the beginning of the month up through the announcement of the
CPI. This plot is the cumulative sum of the coefficient estimates in
equation (6), $\sum_{j=-30}^{k} \hat{\gamma}_j$, where $k = -30, \ldots, 5$. The cumulative sum
rises smoothly from day -23 up to day -9, where it is approximately
.59. There is no additional change until the announcement day, when 
it rises to about .71. Thus, an unexpected 1.0 percent increase (de¬ 
crease) in the CPI is associated with a 0.71 percent increase (decrease) 
in bond prices during this period of 30 trading days. It appears that 
about 85 percent of the adjustment of bond prices occurs between 
days -23 and -9, when the inflation occurs, and the remaining 15 
percent adjustment occurs immediately after the CPI is announced. 
The two standard error range around the cumulative sum at day 0 is 
from 0.33 to 1.09. 
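
As a rough check on these magnitudes, the shares and the standard error implied by the reported figures can be recomputed directly. The inputs are the approximate values quoted above (.59, .71, and the range 0.33 to 1.09). Because they are rounded readings, the recomputed shares agree with the quoted 85 and 15 percent only approximately. The variable names are ours.

```python
cum_at_day_minus_9 = 0.59     # approximate cumulative sum at day -9 (from the text)
cum_at_day_0 = 0.71           # approximate cumulative sum at the announcement day
lower, upper = 0.33, 1.09     # two-standard-error range at day 0 (from the text)

share_within_month = cum_at_day_minus_9 / cum_at_day_0
share_on_announcement = 1.0 - share_within_month
midpoint = (lower + upper) / 2.0
implied_std_error = (upper - lower) / 4.0   # the range spans +/- 2 standard errors

print(f"share accumulated by day -9 = {share_within_month:.2f}")
print(f"share on announcement day   = {share_on_announcement:.2f}")
print(f"midpoint of range           = {midpoint:.2f}")
print(f"implied standard error      = {implied_std_error:.2f}")
```

The midpoint of the range coincides with the cumulative sum at day 0, as it should if the range is symmetric around the point estimate.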

IV. Effects of Measurement Error in Expected 
and Unexpected Inflation 

Our measure of unexpected inflation is a proxy for the new informa¬ 
tion that becomes available to the bond market between the beginning 
of a month and the announcement of that month’s CPI. The unex¬ 
pected inflation measure is merely a proxy because the time-series 
prediction model may not fully capture the bond market's informa¬ 
tion at the beginning of each month. Several types of errors could 
affect the estimates in figure 1 and in table 5. Some extreme cases 
help put the previous results in perspective. 

First, suppose that the bond market has perfect foresight, so that 
the actual inflation rate is perfectly predictable from information 
available to bond traders, although it is not perfectly predictable from 
past inflation rates. In this case, the distinction between expected and 
unexpected inflation is meaningless, and the statistical model used to 
measure unexpected inflation would be a poor proxy for the new 
information that becomes available to the bond market. There should 
be no reaction of bond prices to the announcement of the CPI. Thus, 
the fact that bond prices seem to be positively related to our measure 
of unexpected inflation on the day after the announcement indicates 
that our statistical model does proxy for the unexpected component 
of the CPI. 

Another question that arises concerns the effects of random or 
systematic measurement errors in the calculation of the CPI. Suppose 
that the level of the CPI contains a serially random measurement 
error each month (possibly due to sampling error in collecting a sam¬ 
ple of consumer goods prices). Since the payoff on indexed bonds is 
linked to the level of the CPI in the month of maturity, random 
sampling errors in the current month should not affect bond traders' 
expectations of the future level of the CPI. In this case, any measure 
of unexpected inflation would contain two parts, one that represents
new information about current inflation and future expected inflation
and a second part that is pure measurement error. The larger the
variance of the measurement error relative to the variance of the 
underlying unexpected inflation rate, the lower will be all of the esti¬ 
mates of the unexpected inflation coefficients. In effect, this is a form 
of the errors-in-variables problem. In this case, all of the unexpected 
inflation coefficients will be biased toward zero, but the relative mag¬ 
nitudes should not be affected. Therefore, the timing of the response 
of bond prices to unexpected inflation would be correctly measured 
by the coefficients in figure 1, for example, but the overall magnitude 
of the effect would be underestimated. Given that the estimates of 
many of the unexpected inflation coefficients in table 5 are more than 
two standard errors greater than zero, it seems that the problem of 
random measurement errors in the CPI is not serious. 
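
The errors-in-variables argument can be illustrated with a small simulation. This is a sketch under assumptions of our own, not the estimation procedure used in the paper: each relative day's coefficient is estimated by a separate simple regression (reasonable if the inflation regressors for different days are nonzero on different observations), true inflation surprises and measurement errors are independent normal draws with equal variance, and the coefficient values are hypothetical. With equal variances the classical attenuation factor is one-half.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate(true_gammas, sd_true=1.0, sd_error=1.0, sd_noise=0.02,
             n_months=50_000, seed=0):
    """Estimate each day's coefficient using the error-ridden inflation measure."""
    rng = random.Random(seed)
    true_u = [rng.gauss(0.0, sd_true) for _ in range(n_months)]
    measured_u = [u + rng.gauss(0.0, sd_error) for u in true_u]
    estimates = []
    for gamma in true_gammas:
        # Bond return on one relative day of each month responds to TRUE inflation.
        returns = [gamma * u + rng.gauss(0.0, sd_noise) for u in true_u]
        # The econometrician only observes the noisy measure.
        estimates.append(ols_slope(measured_u, returns))
    return estimates


true_gammas = [0.02, 0.04, 0.08]     # hypothetical daily coefficients
estimates = simulate(true_gammas)
attenuation = 1.0 / (1.0 + 1.0)      # var(true) / (var(true) + var(error))

for g, est in zip(true_gammas, estimates):
    print(f"true {g:.3f}  estimated {est:.4f}  true x attenuation {g * attenuation:.4f}")
```

In this setup every estimate shrinks by roughly the same factor, so the ratios between coefficients, and hence the timing pattern, are approximately preserved while the total effect is understated.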

It is possible that current errors in the CPI could have a permanent 
effect on future levels of the CPI. 9 A large number of commodity and
service prices are controlled by the government and are automatically 
adjusted to reflect the current level of the CPI. Also, many labor and 
rental contracts are indexed to the CPI or to the U.S. dollar. Thus, it 
is possible that random sampling error could cause a permanent 
change in the CPI. Since the terminal payoff on indexed bonds is 
linked to the CPI, it is possible that bond prices would react to the CPI 
announcement, even if “unexpected inflation” were due to sampling 
errors by the Central Bureau of Statistics. Thus, it is possible that 
bond prices fully reflect the “true” inflation rate at the time when the
inflation occurs, and the subsequent reaction of bond prices when the 
CPI is announced merely reflects the sampling error of the govern¬ 
ment agency. From this perspective, the proportion of the reaction of 
bond prices to unexpected inflation that occurs after the announce¬ 
ment is an upper bound on the extent to which bond prices fail to 
reflect the “true” inflation rate.

Another source of error in our estimates of expected and unexpected
inflation concerns the time at which estimates of expected
inflation are revised. Recall that in table 5 two different estimates of
expected inflation are used within each month. For the days from
January 1 through January 15, the two-step-ahead forecast of inflation
based on the November CPI is used. Since the December CPI is
available on January 16, the remaining days in the month use the
one-step-ahead forecast. This procedure is conservative in the sense that it
assumes that bond traders know nothing about the December CPI
until it is announced on January 15. Based on the estimates of the
response of indexed bond prices to unexpected inflation in table 5, it
seems that bond traders know most of the information about the
December CPI by January 1. Therefore, the estimates of expected
and unexpected inflation from January 1 through January 15 in table
5 probably use less information than the market.

9 We would like to thank Michael Mussa for this argument.

To illustrate a different assumption about the information available 
to bond traders, table 6 contains estimates equivalent to those in table 
5, except that the one-step-ahead forecast of January’s inflation is
used for each day in the month. In other words, the regressions in 
table 6 assume that bond traders know the December CPI as of Janu¬ 
ary 1. Interestingly, the estimates of the coefficients of both expected 
and unexpected inflation are larger in table 6 than the corresponding 
estimates in table 5. For example, for the total sample period January 
1970 through January 1979 the estimate of the coefficient for ex¬ 
pected inflation increases from .424 to .648 and the estimate of the 
total effect of unexpected inflation increases from .600 to .720. Al¬ 
though the magnitude of these estimates increases, the pattern of 
timing of the effect of unexpected inflation on bond returns is virtu¬ 
ally the same as in table 5. Approximately 84 percent of the response 
occurs from 11 to 25 trading days prior to the announcement, and
about 11 percent occurs on the day after the announcement. 

We interpret the differences in the results between tables 5 and 6 as 
an illustration of the effects of measurement errors in our proxies for 
expected and unexpected inflation. Recall that the only differences in 
the regressions concern the proxies for expected and unexpected 
inflation during the first half of each month. In table 5 we assume that 
the previous month’s CPI is completely unknown until it is an¬ 
nounced. In table 6 we assume that the market knows the previous 
month’s CPI by the end of the month when the prices are measured. 
The estimates of expected and unexpected inflation that are used in 
table 6 contain more current information than the estimates in table 
5. If the bond market really does have such current information 
about inflation, the proxies used in table 5 contain measurement er¬ 
rors, and it is not surprising that the coefficient estimates for both 
expected and unexpected inflation are closer to zero. Of course, it is 
well to reiterate that the pattern of timing of the response of bond 
prices to unexpected inflation is unaffected. 10 


10 Given that most of the reaction of bond prices to unexpected inflation occurs
between days -20 and -10, we also considered the possibility that the period between
CPI announcements was the relevant period of analysis. This would be from day -20
to day 0. The results from estimating this specification, constraining the coefficients for
days -30 to -21 to equal zero, are very similar to the results in tables 5 and 6 and
fig. 1.

V. Summary and Conclusions

The market for consumer-price-indexed bonds provides an opportu¬ 
nity to study the extent to which available information is reflected in 
security prices. We examine the timing of the reaction of bond prices 
to the occurrence and subsequent announcement of inflation in the 
CPI. Most of the reaction (about 85 percent) occurs from 2 to 5 weeks 
before the announcement, which is the period when the inflation 
occurs. There is no discernible reaction of bond prices during the 2 
weeks from the end of the month when commodity prices are mea¬ 
sured until the announcement of the price index. There is a 
significant reaction of bond prices on the day after the CPI is an¬ 
nounced (about 15 percent of the total reaction). Thus, it seems that 
bond prices reflect most of the information about inflation at the same 
time that the inflation occurs, but bond prices do not fully reflect the 
behavior of inflation, since there is a reaction to the formal announce¬ 
ment of the CPI. 

The evidence in this paper has implications for the theoretical literature
on asset pricing. Some models (most notably Grossman [1976])
show the existence of equilibrium prices that reflect all traders’ private
information, even when information varies across traders and no
trader collects information already available to others. To the extent 
that the CPI is just an aggregation of individual commodity prices, the 
announcement of the CPI does not add to the set of information 
available to all investors. Since bond prices react to the CPI announce¬ 
ment, they do not fully reflect the aggregate information available to 
all investors. 

Nevertheless, the evidence in this paper indicates that about 85 
percent of the reaction of bond prices occurs during the month when 
consumer goods prices are changing. Interestingly, there seems to be 
no additional reaction between the end of the month and the an¬ 
nouncement of the CPI 2 weeks later. Thus, even though bond prices 
do not completely reflect inflation as it occurs, it seems that most of 
the information about inflation is reflected in bond prices very 
quickly. This finding has implications for the monetarist rational expectations
business-cycle models, such as Lucas (1973, 1975) and
Barro (1980). In these models, current economy-wide data (such as 
the inflation rate) are not available to individuals, so their decisions 
are based on inferences about economy-wide data from observations 
in local markets (such as individual commodity prices). Consequently, 
people confuse relative price changes with overall inflation. Our 
findings suggest that this confusion is minimal, as most of the learning 
about unexpected inflation takes place as soon as inflation occurs. At 
the least, the results in this paper raise questions about whether misperceptions
of inflation are large enough to explain substantial fluctuations
in real activity.


This paper provides estimates of the cost of protection to the Canadian
economy for the mid-1970s. The estimates are based on an
applied general equilibrium model incorporating scale economies,
imperfect competition, and capital mobility. Both unilateral and
multilateral tariff reductions are considered. The results of these
experiments suggest that the cost of protection to the Canadian
economy is considerably greater than that suggested by conventional
general equilibrium analysis. Welfare gains from trade liberalization
are found to be on the order of 8-10 percent of GNP. Accompanying
both trade policies is a rationalization of industries with a
lengthening of production runs, lowering of price-cost markups,
and increases in factor productivity. Experiments investigating the
sensitivity of these results to the underlying parameters of the model
are also reported.


This paper reports on some trade liberalization experiments undertaken
with a recently constructed general equilibrium model of the
Canadian economy (Harris 1984a, 1984b). The model incorporates
features of industrial organization thought to be important in considering
the effects of international trade on small open economies. Of
these, the most significant are the inclusion of economies of scale and
imperfectly competitive market structures. Both of these features are
thought to be important in assessing the costs of protection in small
open economies. Some economists argue that the presence of foreign
and domestic tariffs, by restricting domestic industry to produce for
the small domestic market, leads to highly concentrated industries in
which firms do not exhaust economies of scale. The result in the
manufacturing sector is industries characterized by high costs and low
productivity. This view suggests that the cost of tariff protection is
quite high. Freer trade, by subjecting domestic industry to increased
foreign competition and allowing access to the larger world market,
results in lower price-cost margins and in firms’ achieving longer
production runs with lower average costs of production.

This view is outside the traditional theoretical framework of neoclassical
trade theory. Economists emphasizing scale economies and
imperfect competition as important variables in estimating the impact
of trade liberalization include Dales (1966), Balassa (1967), Eastman
and Stykolt (1967), Wonnacott and Wonnacott (1967), and Corden
(1972).

Although the theoretical literature integrating industrial organization
and international trade is growing quite rapidly (see, e.g., Krugman
1979, 1980; Lancaster 1979; Brander 1981; Helpman 1981), to
date the industrial organization (IO) approach has had little impact
on empirical studies of the costs of protection. 1 In most empirical
work, variants of the basic neoclassical trade model have been employed.
Examples of recent studies adopting this framework are
Magee (1972), Boadway and Treddenick (1978), Cline et al. (1978),
Williams (1978), Brown and Whalley (1980), and Deardorff and Stern
(1981). The early partial equilibrium studies are summarized in Corden
(1975). These studies have found that the benefits of trade
liberalization are typically quite small, often on the order of 0.0-1.0
percent of GNP. As these studies use models maintaining the assumptions
of constant returns to scale and perfect competition, the results
have been viewed with some skepticism, particularly in the case of
small open economies. The purpose of the present paper is to report
the impact on cost of protection estimates of the presence of economies
of scale and imperfect competition in an applied general equilibrium
(GE) model. The basic issue is whether or not the benefits of
trade liberalization suggested by an IO approach, consistent with the
data and an economic model, are significantly greater than conventional
static estimates.

1 Most of these theoretical papers emphasize product differentiation, a feature not
emphasized in the results reported here. Harris (1984b) reports on the effect of incorporating
product differentiation explicitly in the model and its quantitative impact. A
recent paper by Pearson and Ingram (1980) incorporates scale economies looking at
the empirical estimates of the benefits to economic integration between Ghana and the
Ivory Coast.

The results, implementing the model on a 1976 Canadian data set, 
suggest that the gains are considerably greater than suggested by 
conventional GE analysis. The benefits to two trade liberalization 
policies are considered: a unilateral removal of domestic tariffs, and a 
multilateral removal of both domestic tariffs and foreign tariffs 
against domestic exports. The multilateral tariff cuts yield the largest 
benefits, with a gain in welfare equivalent to approximately 8.5 per¬ 
cent of national income. This number is substantially larger than 
those found with the neoclassical trade models cited above and is 
comparable to the figure reported by Wonnacott and Wonnacott 
(1967) for Canada in the mid-1960s. The mechanism by which the 
welfare gains are achieved is also in broad agreement with that suggested
by the IO view. Accompanying both trade policies is a rationalization
of industries with a lengthening of production runs, a lowering
of price-cost markups, and increases in factor productivity. The
results indicate that rationalization effects play an important role in
the adjustment of the economy to trade liberalization. The equilibrium
outcome of industry rationalization is both interindustry and
intraindustry adjustment. The results confirm the relative importance of
intraindustry adjustment and suggest that for small open economies,
neoclassical models may be seriously misspecified. We would caution 
the reader, however, that the results obtained from one data set are 
suggestive but not conclusive. If these results can be confirmed for 
other small open economies, with heavily protected manufacturing 
sectors, the proponents of free trade will be free of the “tyranny of 
triangles”—the low-welfare-cost estimates of protection. 

The paper is organized as follows. An overview of the model is
given in the first section. (The equations of the model are presented
in the Appendix.) The discussion in this paper is limited to a brief
verbal description. 2 In Section II the methodology used to calibrate
the model is set out. In the third section the results of the trade
liberalization experiments are presented; summary statistics are presented
and discussed at both the aggregate and the industry level. In
order to examine the robustness of the results reported, sensitivity
analysis is conducted with respect to a number of key parameters of
the model. The conclusions of the study are summarized in the final
section of the paper.

2 For further discussion of the mathematical model see Harris (1984a, 1984b).


I. Overview of the Model 

The general equilibrium (GE) model consists of 29 domestic indus¬ 
tries. Of these, 20 industries are characterized by economies of scale 
in production and imperfectly competitive market structures. The 
noncompetitive industries correspond to the Canadian manufactur¬ 
ing industries identified at the two-digit level of the SIC code. The 
remaining nine industries are perfectly competitive constant-cost in¬ 
dustries. Among these industries are the natural resource and service 
sectors of the Canadian economy. In the model, commodities are 
distinguished not only by their physical attributes but also by their 
location of production: domestic or foreign. Within each of the 29 
commodity categories, two commodities are therefore distinguished: 
the domestically produced good and its imported counterpart. Fol¬ 
lowing Armington (1969), for each commodity category a commodity 
aggregate is formed in which the domestic and imported goods are 
treated as close but imperfect substitutes by all demand categories. 
Thus intraindustry trade or “cross-hauling” is a characteristic of 
trade, and one given considerable emphasis by Balassa (1967) and 
Grubel and Lloyd (1975). In addition, another commodity category, 
noncompeting imports, exists in the model. This category consists of 
an imported good for which there is no (observed) domestically pro¬ 
duced substitute. 

In international markets, the Canadian economy is modeled as an 
“almost” small open economy. In the market for imports, domestic 
producers and consumers are assumed to take the price of each im¬ 
ported good as given and in perfectly elastic supply at the world price. 
In export markets domestic producers are assumed to be price mak¬ 
ers; that is, they face less than perfectly elastic demand curves for 
their products. 3 The elasticity of export demand facing producers will
vary across industries. 

There are two primary factors of production in the model: capital
and labor. Each factor is assumed to be homogeneous and mobile
across industries and firms. Capital is internationally mobile and in
perfectly elastic supply at the world rental rate; labor is internationally
immobile. The domestic wage is determined in a perfectly competitive
labor market. The economy’s resource endowment consists of
a fixed supply of labor and ownership of domestic capital.

3 For empirical support, in the Canadian case, of the “almost” small open economy
hypothesis, see Appelbaum and Kohli (1979).

The model takes account of a number of tax and tariff distortions 
in the Canadian economy. All tax, tariff, and subsidy rates are ex¬ 
pressed in ad valorem form. Among the domestic taxes incorporated 
into the model are sales taxes on final domestic consumption, taxation 
of the use of intermediate goods by different sectors at different 
rates, producer subsidies, and export taxes. Tariff rates, both domes¬ 
tic and rest of world, are inclusive of ad valorem equivalents of non¬ 
tariff barriers. 

The technology of each competitive industry is represented by a 
unit cost function. The costs of each industry include not only labor 
and capital costs but also expenditures on the output of other indus¬ 
tries, both domestically produced and imported. The unit cost func¬ 
tion, assumed independent of industry output, is specified as a Cobb- 
Douglas functional form, defined over the input prices of the primary 
factors and price indices for each of the 30 commodity categories. 
The price index of each commodity aggregate is assumed to be a 
Cobb-Douglas subaggregator defined over the price of the corre¬ 
sponding domestically produced and imported commodities. With 
this specification of technology, substitution in production is not only 
possible between primary factors and intermediate commodity aggre¬ 
gates but, within each commodity aggregate, is also feasible between 
domestic and imported goods. 
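
The nested Cobb-Douglas structure just described can be sketched as follows. This is an illustration under our own assumptions, not the model's calibrated specification: the function names, the two commodity categories, and all share values are hypothetical, shares are assumed to sum to one at each level, and any constant scale term in the cost function is normalized to one.

```python
import math


def cobb_douglas_index(prices, shares):
    """Geometric index: product of prices raised to their share weights."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return math.exp(sum(s * math.log(prices[k]) for k, s in shares.items()))


def unit_cost(wage, rental, commodity_prices, cost_shares, domestic_shares):
    """Unit cost over primary factors and commodity aggregates.

    commodity_prices maps a category to (domestic price, imported price);
    each category's price index is itself a Cobb-Douglas subaggregator.
    """
    prices = {"labor": wage, "capital": rental}
    for name, (p_dom, p_imp) in commodity_prices.items():
        d = domestic_shares[name]
        prices[name] = cobb_douglas_index(
            {"domestic": p_dom, "imported": p_imp},
            {"domestic": d, "imported": 1.0 - d},
        )
    return cobb_douglas_index(prices, cost_shares)


# Hypothetical shares for an industry using two intermediate categories.
cost_shares = {"labor": 0.4, "capital": 0.2, "steel": 0.3, "services": 0.1}
domestic_shares = {"steel": 0.6, "services": 0.9}

base = unit_cost(1.0, 1.0, {"steel": (1.0, 1.0), "services": (1.0, 1.0)},
                 cost_shares, domestic_shares)
cheaper_imports = unit_cost(1.0, 1.0, {"steel": (1.0, 0.9), "services": (1.0, 1.0)},
                            cost_shares, domestic_shares)
print(f"unit cost at base prices = {base:.4f}")
print(f"unit cost after a 10 percent fall in the imported steel price = {cheaper_imports:.4f}")
```

A fall in the price of an imported intermediate lowers unit cost in proportion to that good's weight in the nested index, which is one channel through which a tariff cut feeds into domestic costs in a structure of this kind.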

The assumption of constant per unit costs, together with a zero 
profit condition, requires, in equilibrium, that price in each competi¬ 
tive industry be equal to unit cost. 

Each of the 20 noncompetitive industries consists of an endoge¬ 
nously determined number of firms. Within an industry all firms are 
assumed identical with respect to their technology and economic be¬ 
havior. Thus it is meaningful to speak of a representative firm within 
each industry. Freedom of entry and exit exists in all industries, so 
that firms will enter and exit industries in response to the presence of 
economic profits or losses. In this manner, in the long run, the num¬ 
ber of firms is determined endogenously. 

The cost function of each representative firm consists of both vari¬ 
able and fixed costs. The use of primary factors—capital and labor— 
enters into both the variable and fixed costs of the firm. Variable per 
unit costs are assumed to be independent of the level of output produced
by the firm. The functional form of the firm’s unit variable cost
function is identical to that of the industry unit cost function in the
competitive industries. This is a Cobb-Douglas function specified over
the input prices of primary factors and price indices of all commodity
aggregates, where each price index is a Cobb-Douglas subaggregator.





JOURNAL OF POLITICAL ECONOMY 


So, at the level of the firm in each noncompetitive industry, substitution
possibilities exist between intermediate goods (both domestically
produced and imported) and primary factors. The fixed costs of the
firm consist only of capital and labor costs. The presence of fixed costs
in the firm’s cost structure is explained by an indivisibility; a fixed
amount of capital and labor is required to set up a plant. The
specification of constant per unit variable cost plus a fixed cost component
leads, at given input prices, to declining average costs that
asymptotically approach unit variable cost. In these circumstances,
the minimum efficient scale (MES) of the firm is defined as that level
of output at which average costs are within 1 percent of unit variable cost. A
measure of the steepness of the average cost curve is given by the cost
disadvantage ratio (CDR), which measures the percentage by which
average cost at an output level one-half of MES exceeds average cost
at MES.
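
These definitions translate directly into a few formulas, sketched below. The sketch assumes that "unit cost" in the MES definition means unit variable cost, which we denote v, with fixed cost F; the function names and the numerical values are hypothetical. Average cost is then v + F/q, and the 1 percent rule gives MES = F/(0.01 v).

```python
def average_cost(output, unit_variable_cost, fixed_cost):
    """Average cost with constant unit variable cost plus a fixed cost."""
    return unit_variable_cost + fixed_cost / output


def minimum_efficient_scale(unit_variable_cost, fixed_cost, tolerance=0.01):
    """Output at which average cost equals (1 + tolerance) x unit variable cost."""
    return fixed_cost / (tolerance * unit_variable_cost)


def cost_disadvantage_ratio(reference_output, unit_variable_cost, fixed_cost):
    """Proportion by which average cost at half the reference output exceeds
    average cost at the reference output."""
    ac_ref = average_cost(reference_output, unit_variable_cost, fixed_cost)
    ac_half = average_cost(reference_output / 2.0, unit_variable_cost, fixed_cost)
    return ac_half / ac_ref - 1.0


v, F = 10.0, 500.0                      # hypothetical cost parameters
mes = minimum_efficient_scale(v, F)
cdr_at_mes = cost_disadvantage_ratio(mes, v, F)
print(f"MES = {mes:.0f}, average cost at MES = {average_cost(mes, v, F):.2f}")
print(f"CDR at MES = {cdr_at_mes:.4f}")
```

If MES is taken literally from the 1 percent rule, the ratio computed at MES equals 0.01/1.01 whatever the values of v and F, which is why the function accepts any reference output, for instance an MES estimate obtained from another source.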

Product specialization or horizontal product diversity within the
plant, as distinct from scale economies due to the plant size, has been 
emphasized by Balassa (1967) and Eastman and Stykolt (1967), 
among others. Specialization within the plant may be an important 
source of productivity gains were the plant to rationalize because of a 
cut in protection. In the model used in this paper no formal allowance 
is made for production specialization although the empirical scale 
economies estimates incorporate these costs implicitly. 

In each noncompetitive industry, firms are viewed as price makers.
Two hypotheses regarding how prices are chosen by firms are considered.
The first hypothesis is based on the Negishi (1961) perceived-demand-curve
approach. Each representative firm is assumed to perceive
a constant-elasticity demand curve for its product. On the basis
of this perceived demand curve, the firm chooses a markup of price
over unit cost that maximizes profits. The optimal markup chosen in
this manner satisfies the familiar Lerner Rule. The elasticity the firm
uses in its perceived demand curve corresponds to a “true” elasticity
from the underlying general equilibrium model. Price setting in this
manner will be referred to as the monopolistic competitive pricing
hypothesis (MCPH). The other pricing hypothesis considered will be
referred to as the Eastman-Stykolt (1967) hypothesis (ESH). Under
the ESH the firm sets its price equal to the price of the import-competing
good, inclusive of the domestic tariff. The ESH represents
a collusive form of price setting in which the price of the import-competing
good acts as a “focal point” for domestic producers. In the
policy simulations of the model the actual price selected by the firm is
taken to be a weighted average of the prices set according to the
MCPH and ESH.
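
The two pricing rules and their combination can be sketched as follows. The Lerner Rule, (p - c)/p = 1/e for perceived elasticity e > 1, gives p = c e/(e - 1). We assume here that the relevant unit cost c is unit variable cost. The weight on the MCPH price is a free parameter whose value this section does not state, so the value used below, like the other numbers and the function names, is purely hypothetical.

```python
def mcph_price(unit_variable_cost, perceived_elasticity):
    """Profit-maximizing price under the Lerner Rule for a constant-elasticity
    perceived demand curve."""
    if perceived_elasticity <= 1.0:
        raise ValueError("a finite profit-maximizing price requires elasticity > 1")
    return unit_variable_cost * perceived_elasticity / (perceived_elasticity - 1.0)


def esh_price(import_price, tariff_rate):
    """Price of the import-competing good inclusive of the ad valorem tariff."""
    return import_price * (1.0 + tariff_rate)


def blended_price(unit_variable_cost, perceived_elasticity,
                  import_price, tariff_rate, weight_on_mcph):
    """Weighted average of the MCPH and ESH prices."""
    p_mcph = mcph_price(unit_variable_cost, perceived_elasticity)
    p_esh = esh_price(import_price, tariff_rate)
    return weight_on_mcph * p_mcph + (1.0 - weight_on_mcph) * p_esh


# Hypothetical values: unit variable cost 10, elasticity 5, world price 11.
before = blended_price(10.0, 5.0, 11.0, tariff_rate=0.20, weight_on_mcph=0.5)
after = blended_price(10.0, 5.0, 11.0, tariff_rate=0.00, weight_on_mcph=0.5)
print(f"price with a 20 percent tariff = {before:.2f}")
print(f"price with the tariff removed  = {after:.2f}")
```

In this sketch a tariff cut lowers the price only through the ESH component, holding the perceived elasticity fixed; in the full model the elasticity itself can change as well.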

Domestic final demand for each commodity is assumed to be
generated by a single consumer maximizing an aggregate utility function.
The utility function takes the form of a Cobb-Douglas function
defined over the 30 commodity aggregates. With the exception of the 
commodity noncompeting imports, each commodity aggregate takes 
the form of a constant elasticity of substitution (CES) aggregator 
defined over the domestic and imported goods within each commod¬ 
ity category. Use of the CES subaggregator embodies the Armington 
assumption that the imported and domestic goods within each com¬ 
modity category are viewed as imperfect substitutes by the consumer. 
Given prices for all commodities and the disposable income of the 
consumer, the demand for each commodity is derived from the utility 
maximization hypothesis. 
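
The resulting two-stage demand system can be sketched for a single commodity category. This is a minimal illustration using one standard parameterization of the CES subaggregator, which may differ from the one used in the model. The distribution parameter, the substitution elasticity, the budget share, and the function names are all hypothetical. Under Cobb-Douglas utility a fixed share of income is spent on each category, and the CES subaggregator then splits that spending between the domestic and the imported good.

```python
def ces_expenditure_shares(p_domestic, p_imported, delta_domestic, sigma):
    """Expenditure shares on the domestic and imported good within a category.

    delta_domestic is the CES distribution weight on the domestic good and
    sigma is the elasticity of substitution (sigma != 1 in this form).
    """
    if sigma == 1.0:
        raise ValueError("sigma = 1 is the Cobb-Douglas case; use fixed shares")
    w_dom = delta_domestic * p_domestic ** (1.0 - sigma)
    w_imp = (1.0 - delta_domestic) * p_imported ** (1.0 - sigma)
    return w_dom / (w_dom + w_imp), w_imp / (w_dom + w_imp)


def category_demands(income, budget_share, p_domestic, p_imported,
                     delta_domestic, sigma):
    """Quantities demanded of the domestic and imported good in one category."""
    s_dom, s_imp = ces_expenditure_shares(p_domestic, p_imported,
                                          delta_domestic, sigma)
    spending = budget_share * income      # Cobb-Douglas upper tier
    return spending * s_dom / p_domestic, spending * s_imp / p_imported


# Hypothetical values: income 1000, 10 percent budget share, sigma = 2.
q_dom_0, q_imp_0 = category_demands(1000.0, 0.10, 1.0, 1.0, 0.7, 2.0)
q_dom_1, q_imp_1 = category_demands(1000.0, 0.10, 1.0, 0.8, 0.7, 2.0)
print(f"before import price cut: domestic {q_dom_0:.1f}, imported {q_imp_0:.1f}")
print(f"after import price cut:  domestic {q_dom_1:.1f}, imported {q_imp_1:.1f}")
```

With an elasticity of substitution above one, a fall in the import price shifts spending toward the imported good without eliminating demand for the domestic good, which is the sense in which the two are imperfect substitutes.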

The disposable income of the consumer is derived from three 
sources: ownership of the domestic resource endowment—labor and 
domestically owned capital, possible economic profits accruing to do¬ 
mestically owned firms in noncompetitive industries, and government 
transfers. Government revenue is raised through the system of taxes, 
tariffs, and subsidies in place. All government revenue raised in this 
manner is returned to the consumer in the form of a lump-sum 
transfer. Demand for goods and services by government is not mod¬ 
eled separately but assumed to be incorporated as part of the consum¬ 
er's demand. 

Demand for the exports of the domestic country is assumed to be
generated by an aggregate rest-of-world consumer with exogenous
income. For each commodity category, the exports of the domestic country
and rest of world (ROW) are viewed as imperfect substitutes as
represented by a CES aggregator. Cost minimization results in an
export demand equation for each domestic good that is a function of
the domestic commodity price, the price of ROW exports, and the
foreign tariff on domestic exports. Using this specification of ROW
behavior, a distinction is admitted within the export demand equation
between the domestic price elasticity and the foreign tariff elasticity.

A distinction is made in the model between the short run and the 
long run. The short run corresponds to a period of time during which 
the industrial structure in each of the noncompetitive industries is 
assumed fixed. By industry structure is meant the markup on unit 
variable cost set by each firm and the number of firms existing in each 
industry. A short-run equilibrium of the model is defined as a set of 
product prices, one for each domestically produced good, and a wage 
rate such that all product markets and the factor market clear. Wal¬ 
ras’s Law implies that the balance of payments is in equilibrium. Bal¬ 
ance of payments equilibrium refers to current account balance, or 
requires that the trade surplus be equal to the sum of rental payments 
on foreign-owned capital and economic profits accruing to foreign 
ownership of domestic industry. Consistent with a short-run equilibrium
is the possibility that, in some industries, firms will be earning
pure profits or losses.

The long run of the model corresponds to a time horizon long 
enough to allow firms to enter or exit all industries in response to the 
presence of pure profits or losses. A long-run equilibrium is defined 
as a short-run equilibrium with the additional requirements that in 
each industry (approximately) zero profits be earned and that the 
elasticity of the perceived demand curve under MCPH be equal to the 
elasticity of the firm’s true demand curve.

The model has much in common with the traditional Heckscher-
Ohlin trade model, in which comparative advantage plays the key
role. The comparative advantage effects are present in the model, but
in addition there is scope for intraindustry rationalization. The in¬ 
teraction among scale economies, pricing by firms, and free entry 
provides mechanisms by which intraindustry adjustment occurs. Con¬ 
sider, for example, a cut in the domestic tariff on one industry. If this 
tariff cut forces domestic firms within the industry to cut price, the 
existing firms will make losses. Either all firms must expand output, 
realizing lower costs, or some must exit, leaving remaining firms a 
larger output. In either case there is a cost efficiency gain as the
industry produces at lower average cost. Subsequent interindustry 
shifts in resources will depend not only on relative factor intensities 
but also on relative scale economies between industries. It is generally 
difficult, if not impossible, to get theoretical predictions from a multi¬ 
industry model of this sort. 
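
The rationalization mechanism in the tariff-cut example can be sketched for a single firm. The cost function (a fixed cost plus a constant unit variable cost) and all numbers are assumptions chosen for illustration, not values from the model.

```python
def average_cost(output, fixed_cost, unit_variable_cost):
    """Average cost when total cost is fixed_cost + unit_variable_cost * output."""
    return fixed_cost / output + unit_variable_cost


def break_even_output(price, fixed_cost, unit_variable_cost):
    """Output per firm at which price equals average cost (zero profit)."""
    margin = price - unit_variable_cost
    if margin <= 0:
        raise ValueError("price must exceed unit variable cost")
    return fixed_cost / margin


# Hypothetical firm: the tariff cut pushes the price from 10.0 down to 9.5.
F, c = 100.0, 8.0
q_old = break_even_output(10.0, F, c)
q_new = break_even_output(9.5, F, c)
```

With a lower price each surviving firm needs a longer production run to cover its fixed cost, so average cost falls whether the adjustment comes through expansion or through exit.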


II. Calibration of the Model 

The model contains a large number of parameters that must be as¬ 
signed numerical values. The general approach used to parameterize 
the model is typical of that followed in applied GE work. The parame¬ 
ters of the model are chosen by reference to existing econometric 
studies and so as to be consistent with a given historical data set— 
referred to as a benchmark data set. Once the model is parame¬ 
terized, the benchmark data set is reproduced as an equilibrium of 
the model. For the present model a benchmark data set was
constructed for the Canadian economy for the year 1976. This data set is
presumed to represent a short-run equilibrium of the model in which
the industrial structure of the noncompetitive industries is held
constant. Thus, because of the presence of plant-specific labor and
capital, which are taken as fixed in the short run, firms in different
industries may be making economic profits or losses. In the long run both
of these factors are assumed to be variable; firms will enter and exit
industries so as to bring about a situation of zero economic profit
across industries. The calibration procedure is to select values for all 
of the parameters in the model, taking as given the observed markups 
and number of firms in each industry, to be consistent with the 1976 
benchmark data set. The industrial structure variables are then deter¬ 
mined endogenously by computing the long-run equilibrium of the 
model. It is from this long-run equilibrium, referred to as
the reference equilibrium, that the counterfactual policy
experiments are undertaken.

On the demand side, the exponents of the consumer’s Cobb- 
Douglas utility function defined over the 30 commodity aggregates 
are given by the observed expenditure shares in the benchmark data 
set. Within each commodity aggregate it is not possible, from a single 
observation, to infer the elasticity of substitution between the im¬ 
ported and domestic goods. In the absence of econometric estimates, 
the procedure used to select the elasticities of substitution utilizes 
estimated price elasticities of import demand. Values of the elasticities 
of substitution are selected so that the own-price elasticities of the 
import demand functions, evaluated at the benchmark equilibrium, 
are consistent with the estimated demand elasticities. The selection of 
the import demand elasticities is based on those reported by Hazle- 
dine (1981). The actual best-guess values of the demand elasticities 
used in the calibration procedure are somewhat larger than those 
reported by Hazledine. The reason for this is twofold. First, it is 
known that standard econometric estimates of these elasticities are 
biased downward in absolute value. Second, they are available at a 
level of aggregation above that used in the model. Disaggregation will 
cause the elasticity to rise because of substitution between the various 
commodities within any given aggregate (see Rousslang and Parker 
1981). In the reference equilibrium the import elasticities range from 
-1.0 to -3.0.
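
The step from a target import demand elasticity to an elasticity of substitution can be sketched as follows. The sketch assumes a two-good CES aggregate under a Cobb-Douglas upper tier, so that spending on the category is fixed and the own-price elasticity of imports is -(sigma * (1 - s) + s), where s is the import share of category spending. It covers final consumer demand only; the function names and numbers are hypothetical.

```python
def import_elasticity(sigma, import_share):
    """Own-price elasticity of import demand with category spending fixed."""
    s = import_share
    return -(sigma * (1.0 - s) + s)


def sigma_from_import_elasticity(own_price_elasticity, import_share):
    """Invert the elasticity formula to recover sigma from a target elasticity."""
    s = import_share
    if not 0.0 <= s < 1.0:
        raise ValueError("import share must lie in [0, 1)")
    return (-own_price_elasticity - s) / (1.0 - s)


# Hypothetical category: target elasticity -2.0, benchmark import share 25 percent.
sigma = sigma_from_import_elasticity(-2.0, 0.25)
```

Under these assumptions the implied elasticity of substitution is always at least as large in absolute value as the import demand elasticity it is calibrated to, and the gap widens as the import share rises.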

The export demand functions require the specification of two
elasticities: an elasticity of demand for each commodity aggregate and,
within each commodity aggregate, an elasticity of substitution
between domestic and rest-of-world exports. Within the commodity
aggregate, the elasticity of substitution is assigned the same value as
the elasticity of substitution between domestic and imported goods.
That is, it is assumed the degree to which Canadian and rest-of-world
exports are viewed as substitutes in the world market is the same as 
the degree to which domestic and imported goods are viewed as sub¬ 
stitutes by domestic residents. Estimates of export price elasticities are 
based on those presented in the bibliography of trade elasticities com¬ 
piled by Stern, Francis, and Schumacher (SFS) (1976). For reasons 
advanced in the selection of import elasticities the best-guess values of 
the export elasticities used in the model are somewhat higher than the
SFS estimates. The export elasticities in the reference equilibrium
range from a low of -0.9 to a high of -3.1.

In noncompetitive industries the cost functions to be parameterized 
include both variable and fixed costs. The fixed labor and capital 
requirements of the firm are selected such that the imputed fixed 
costs, together with variable costs, lead to a minimum efficient scale 
(MES) and cost disadvantage ratio (CDR) of operating at a scale less
than MES that are consistent with observed estimates. Estimates of 
MES and CDR for each of the two-digit Canadian manufacturing 
industries are constructed from the econometric estimates presented, 
at the three- and four-digit SIC industry levels, in Fuss and Gupta 
(1979). It is generally believed that econometric estimates of MES and 
CDR are biased downward, particularly in small economies such as 
Canada’s. Engineering estimates are thought to be more reliable but 
are difficult to obtain on any comprehensive basis. To account for the 
downward bias in the econometric estimates, the best-guess estimates 
of MES and CDR used in the model were uniformly scaled up to a 
position approximately midway between the econometric estimates 
and the average engineering estimates. 4

Once all of the parameters of the model, excluding the industrial 
structure variables, have been assigned numerical values, the long-
run equilibrium is computed. The algorithm used to compute the
equilibrium mimics the Marshallian process of adding firms to indus¬ 
tries making economic profits and withdrawing firms from industries 
making economic losses. In qualitative terms the values of the eco¬ 
nomic aggregates—such as national income, the wage, and govern¬ 
ment revenues—are, in the long-run equilibrium, close to their short- 
run values. Finally, there is no assurance that the long-run 
equilibrium of the model is unique. This is a standard problem gen¬ 
eral equilibrium models encounter. In practice, however, this prob¬ 
lem has not arisen. A number of ad hoc tests, such as beginning the 
algorithm at different starting points, have been undertaken and in
no cases have multiple equilibria been found. 
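
The Marshallian adjustment process can be illustrated with a deliberately simplified sketch. It treats one industry in isolation, with a fixed price, a fixed level of industry demand, and symmetric firms, whereas the model's algorithm works on the full general equilibrium. All names and numbers are assumptions made for the example.

```python
def profit_per_firm(n_firms, price, demand, fixed_cost, unit_cost):
    """Profit of one of n symmetric firms sharing a given industry demand."""
    output = demand / n_firms
    return (price - unit_cost) * output - fixed_cost


def entry_exit_equilibrium(n_start, price, demand, fixed_cost, unit_cost,
                           max_iter=10_000):
    """Add or withdraw one firm at a time until profits are roughly zero."""
    n = n_start
    for _ in range(max_iter):
        if n > 1 and profit_per_firm(n, price, demand, fixed_cost, unit_cost) < 0:
            n -= 1  # incumbents make losses: one firm exits
        elif profit_per_firm(n + 1, price, demand, fixed_cost, unit_cost) >= 0:
            n += 1  # an extra firm would still break even: one firm enters
        else:
            return n  # no losses, and no room for a further entrant
    raise RuntimeError("entry/exit process did not settle")
```

Starting the loop from different initial firm counts, in the spirit of the ad hoc tests mentioned above, gives the same answer in this toy case: `entry_exit_equilibrium(5, 10.0, 1000.0, 100.0, 8.0)` and `entry_exit_equilibrium(40, 10.0, 1000.0, 100.0, 8.0)` both settle at the same number of firms.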

III. Results of the Trade Liberalization 
Experiments 

In this section the results of the trade liberalization experiments un¬ 
dertaken with the model will be presented. Two trade liberalization 

4 The source of the average engineering estimates is Gorecki (1978). The MES
estimates used do not control for horizontal product diversity within the plant.
Consequently, "scale economies" implicitly include higher fixed costs associated with a more
highly diversified plant. If tariff reduction leads to reduced product diversity, these
fixed costs would fall. The model does not allow this to happen, which implies that the
estimates of welfare gain to tariff removal are, if anything, biased downward.
policies are considered: (1) removal of all domestic tariffs, or
unilateral free trade (UFT), and (2) removal of all foreign tariffs on
domestic exports and of all domestic tariffs, or multilateral free trade (MFT).
These policy exercises take the form of counterfactual experiments. 
In each case tariffs are removed and the new equilibrium of the 
model is computed and compared with the reference equilibrium. 
Comparison of the two equilibria yields a prediction of the quantita¬ 
tive change in the values of all of the endogenous variables due to the 
policy change. Each of the experiments reported considers a compari¬ 
son of the long-run, zero-profit equilibria of the model. 

Within the model the “multilateral free trade” experiment is to be 
interpreted in a very narrow sense. Strictly speaking the simulation 
results refer to a case in which all other countries reduce their tariff 
barriers on Canadian exports but leave in place tariff barriers against 
all other countries' exports. This would rationalize the assumptions of 
constant world prices and exogenous export demand functions in the 
multilateral simulation. For the analysis of true world free trade, one 
would need a world general equilibrium model in which all prices and 
incomes are endogenous. The “multilateral cut” simulation, as con¬ 
ducted in this paper, is of interest since it isolates the effect that 
foreign tariffs have on the Canadian economy. We shall later argue 
that the quantitative results on multilateral free trade are lower 
bound estimates for the Canadian gains from world free trade. 

The foreign and domestic tariffs used in the trade experiments 
were derived from a variety of sources and include ad valorem equiv¬ 
alents of some of the relevant nontariff barriers. 5 Domestic tariffs 
range from a high of 33 percent on textiles and knitting mills to 0 
percent in some industries, such as mining. The unweighted average 
tariff for the manufacturing industries is 11 percent. Foreign tariffs 
range from a high of 53 percent on textiles to 0 percent on a number 
of nonmanufacturing products, with an unweighted average on
manufactured products of 16 percent.

In the model there are four important sets of parameters: import 
elasticities, export elasticities, MES and CDR estimates, and a parame¬ 
ter that governs, in the noncompetitive industries, the degree to 
which prices are set according to the two hypotheses—monopolistic 
competitive pricing (MCPH) and Eastman-Stykolt (1967) (ESH). The 
first three sets of parameters are based on empirical estimates known 


5 Domestic ad valorem tariff rates were calculated using data on total tariff payments
by commodity provided by the Structural Division of Statistics Canada. Sources for the
ad valorem equivalents of domestic nontariff barriers (NTB) were Hazledine (1981,
table 2) and Economic Council of Canada (1975, table 2-6). Foreign ad valorem tariff
rates and ad valorem NTB equivalents were constructed from the rates given by Whatley
(1980, table 8) for the United States, the European Common Market (EEC), and
Japan.
to be rather unreliable. The fourth parameter can be interpreted as a
behavioral hypothesis. The results of the trade experiments are re¬ 
ported initially with the values of each of these parameters set at their 
best-guess values. Given the nature of all of these parameters, further 
sensitivity analysis is conducted to examine how the outcomes of the 
experiments vary as the values of the parameters are altered. The 
results are reported at two levels. First, results are presented for econ¬ 
omywide aggregates such as the wage rate and GNP. Second, results 
are given at the industry level. Since a large number of industry 
variables are presented it is not possible to examine in detail each of 
the industries—such a task is left to the interested reader. However, 
some discussion will be focused on why particular industries do well 
and others very poorly under the policy changes. 


Unilateral Free Trade (UFT) 

Aggregate Results 

A number of summary statistics that document the impact of UFT on 
the domestic economy are reported in table 1. With the removal of 
domestic tariffs, table 1 shows that the domestic economy experiences 
a welfare gain, measured by the Hicks equivalent variation as a
percentage of GNP in the reference equilibrium, of the order of 4
percent. 6 Accompanying this gain in real income is an increase in the
domestic wage of 10 percent. 7 An increase in real GNP of about 3.5 
percent is also obtained. A large portion of the gain in welfare can be 
attributed to the rationalization of the manufacturing industries. Re¬ 
moving the domestic tariff has the effect, on average, of increasing 
the length of production runs in the manufacturing industries by 
close to 40 percent. Through the rationalization of these industries, 
the lengthening of production runs of individual firms results in a 
decline of average fixed costs of just under 20 percent. 

The increase in cost efficiency in the manufacturing industries has 
quite an impact on factor productivity. In particular, there is an in¬ 
crease in aggregate labor productivity of 20 percent. The intersec¬ 
toral allocation of resources that accompanies UFT is quite inter¬ 
esting. In terms of employment, the manufacturing sector gains at the 
expense of the rest of the economy. This result is contrary to what one 
would expect on the basis of traditional comparative advantage, 

6 The Hicks Equivalent Variation is defined as the amount of income that would have
to be taken away from the consumer, after the tariff removal, such that the consumer is 
just as well off as he was in the initial situation, facing the prices prevailing in the initial 
situation. 

7 The domestic wage is expressed in real terms. It is measured in terms of a bundle of 
foreign goods imported at constant prices. 
Variable                      Unilateral Free Trade    Multilateral Free Trade

Wage                                   9.98                    25.21
GNE                                    3.52                    12.58
GNP (real)                             3.19                     7.02
Welfare gain                           4.13                     8.59
Length of production runs             41.40                    66.84
Average fixed costs                  -18.93                   -29.94
Labor productivity                    19.57                    32.62
Total factor productivity              8.58                     9.50
Trade volume                          53.14                    88.61
Labor reallocation index               3.93                     6.15
Intraindustry trade index              -.70                    -1.71

NOTE: All relative changes are with respect to the reference (all tariffs in place) equilibrium.
(1) GNE and GNP refer to gross national expenditure and product, respectively.
(2) The welfare gain is measured as the Hicks Equivalent Variation as a percentage of initial GNE.
(3) The length of production run index is a weighted average of output per firm in each
manufacturing industry, where the weights are the industries' shares of total manufacturing output.
(4) The index of average fixed costs is the weighted average of average fixed costs per firm in each
manufacturing industry, where the weights are the industries' shares of total manufacturing output.
(5) Labor productivity is defined as output per unit of labor. The labor productivity index is defined
as the weighted average of labor productivity in each industry, where the weights are the industries'
shares of the total output of all industries.
(6) Total factor productivity is measured by a geometric quantity index of all inputs. The aggregate
index of total factor productivity is the weighted average of total factor productivity in each
industry, where the weights are the industries' shares of the total output of all industries.
(7) Total trade volume is the sum of the value of exports and imports across all industries,
including noncompeting imports.
(8) The intraindustry trade index is the weighted average of the Balassa-Grubel-Lloyd (BGL)
intraindustry trade index in each industry, where the weights are the industries' shares of total
trade volume. The BGL index is defined as

    $1 - \frac{|E_i - M_i|}{E_i + M_i}$,

where $E_i$ and $M_i$ represent the value of industry $i$ exports and imports, respectively.
(9) The labor reallocation index measures the percentage of the total labor that must reallocate
between industries.


which would argue, in the case of Canada, that trade liberalization
would lead to an expansion of the primary sector at the expense of 
manufacturing. That this did not happen suggests that the increase in 
absolute productive efficiency in the manufacturing sector, brought 
about through rationalization effects, was sufficient to shift compara¬ 
tive advantage in favor of manufacturing. 

These results suggest that there are significant benefits to be
obtained from a policy of UFT, and furthermore that most of these
benefits are achieved through rationalization of the manufacturing 
sector. 

Industry Results 

Statistics summarizing the impact of UFT on individual industries are 
presented in table 2. First, in terms of gross output and value added, 
UFT leads to an expansion in all of the nonmanufacturing industries.
In the manufacturing sector, gross output increases in 75 percent of 
12. Primary metals              42.85   30.24    42.59   11.24   3.09   -18.92
13. Metal fabricating           20.94    9.95  -252.43   14.11   5.08   -29.94
14. Machinery                    5.41    -.44  -221.02   12.25   3.27   -21.81
15. Transportation equipment   105.53   88.08   229.04   13.98    126    -5.99
16. Electrical products          5.80   -4.46  -244.07   15.23   5.90   -38.50
the industries, while value added increases in only eight of the 20
industries. An interesting feature of the manufacturing sector is the 
magnitude with which changes in output occur. Within the manufac¬ 
turing sector 12 industries expand their output by greater than 10 
percent, while in the nonmanufacturing sector all relative output 
changes are less than 10 percent. Similar results are apparent with 
respect to changes in value added. 

The change in the pattern of trade is also quite revealing. In all 
industries both exports and imports rise. Thus, significant import 
substitution occurs, but at the same time all manufacturing industries 
increase their exports. Table 2 reveals that on balance, in 13 of the 
manufacturing industries, the increase in imports exceeds that of 
exports. In the case of the nonmanufacturing sector all industries
experience an increase in imports relative to exports. In the reference
equilibrium the net surplus of the nonmanufacturing sector, exclud¬ 
ing noncompeting imports, is approximately $6.25 billion. In the 
manufacturing sector plus noncompeting imports there is a deficit of 
$4.11 billion. Under UFT the surplus in the nonmanufacturing sector
falls to $3.18 billion, but the manufacturing sector moves into a sur¬ 
plus position of approximately $3.02 billion. Thus, under UFT, the 
trade surplus as a whole increases, offset by increases in capital im¬ 
ports, and both manufacturing and nonmanufacturing sectors move 
into approximately equal positions in terms of their net surplus
position.
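
The sector balances quoted in this paragraph can be added up directly. The short sketch below uses only the dollar figures given in the text; treating the two sectors as exhaustive, and the variable names themselves, are assumptions of the example.

```python
# Net trade positions in billions of dollars, as quoted in the text.
# In the reference equilibrium the manufacturing figure includes
# noncompeting imports; a negative number is a deficit.
reference = {"nonmanufacturing": 6.25, "manufacturing": -4.11}
uft = {"nonmanufacturing": 3.18, "manufacturing": 3.02}

surplus_reference = sum(reference.values())
surplus_uft = sum(uft.values())
change_in_surplus = surplus_uft - surplus_reference
```

On these figures the overall surplus rises from roughly $2.1 billion to roughly $6.2 billion, which is the increase the text describes as offset by capital imports.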

Under UFT, labor productivity increases in all industries. In the
manufacturing sector the increases are particularly large, with labor
productivity increasing by at least 10 percent in all industries. In fact,
in six industries labor productivity increases by over 20 percent, with
the largest increase in clothing at 44 percent. The gains in labor
productivity in the manufacturing sector are apparent from the
increase in the values of the scale elasticities. In all industries economies
of scale have been achieved by longer production runs. The reason
for the increases in the length of production runs is apparent from
examining what has happened to markups in each industry. In all
industries the markups have fallen, and in 11 industries the fall has
been quite dramatic: over 30 percent. Recall that under the ESH
firms set their prices equal to the border price of the import-competing
good, inclusive of the domestic tariff. The removal of the domestic
tariff will thus cause firms to lower their prices. In the presence of
declining average costs, the lower markup means that firms must
increase their output if they are to remain profitable. In equilibrium
this larger output per firm must translate into fewer firms or larger
industry output. As is evident from table 2, both results have
occurred in many manufacturing industries.




An examination of the changes in output and employment in the 
manufacturing sector indicates that while some industries prosper 
other industries decline as a result of UFT. In order to understand 
better why some industries do well and others poorly, it is instructive 
to examine in detail an industry in each category. On the basis of 
employment, transportation equipment is the overwhelming winner, 
with an 80 percent increase in employment. The reasons why trans¬ 
portation equipment does well are quite apparent. In the reference 
equilibrium, the ratio of fixed costs to variable costs in this industry is 
among the highest in the manufacturing sector. Economies of scale 
are far from being exhausted. Under these circumstances the possibil¬ 
ity of substantial rationalization exists. With UFT a high degree of 
rationalization does in fact occur. Markups fall by 6 percent, and this 
is accompanied by increases in production runs of 14 percent. Given
the unexploited economies of scale that exist in the reference equilib¬ 
rium, the larger output leads to significant decreases in average costs. 
The fall in the industry price, given the very elastic export demand 
curve this industry faces, allows the industry to do quite well in export 
markets. A significant portion of the extra output is exported, as is
reflected in the increase of net exports by over 200 percent.

In contrast, consider a losing industry under UFT. The biggest 
loser is clothing, with a 46 percent fall in employment and a 45 per¬ 
cent fall in value added. This industry has virtually the worst of all 
possible characteristics from the point of view of surviving import 
competition. In the reference equilibrium the domestic tariff is quite 
high at 32 percent. There are some unexploited scale economies but
these are very moderate. The industry is very labor intensive, which 
means that increases in the wage rate significantly affect its costs. 
Nevertheless, rationalization effects do occur, with the length of pro¬ 
duction runs more than doubling, accompanied by a great deal of exit 
from the industry. However, this is not sufficient to prevent the
industry from declining. With the removal of the tariff and the large
fall in the price of imports, the industry faces strong foreign competition.
Import substitution on the part of domestic consumers and firms 
leads to a large increase in the volume of imports. 


Multilateral Free Trade (MFT) 

Aggregate Results 

The aggregate results of the MFT experiment are presented in table 
1. In comparison with UFT it is seen that all of the aggregate variables 
experience larger relative changes. Under MFT the real income gain 
to Canadians, measured by the Hicks Equivalent Variation, is about 
$13.2 billion or 8.6 percent of reference GNE. This welfare gain is
more than double that achieved under UFT. The increase in the wage
is also quite striking, an increase of 25 percent. Gross national expen¬ 
diture increases by about $20 billion in the new equilibrium—a rela¬ 
tive change of 12.5 percent. The primary source of these gains of real 
income is again the rationalization effects within industries and the 
resulting intersectoral reallocation of resources from a comparative 
advantage point of view. Evidence of rationalization within industries 
is apparent from examining what happens to the length of produc¬ 
tion runs and average fixed costs. The length of production runs 
increases by 67 percent while the index of average fixed costs falls by 
30 percent. The results these rationalization effects have on produc¬ 
tivity are quite dramatic. Labor productivity rises by 33 percent, total 
factor productivity by a more modest 10 percent. 

The magnitude of the intersectoral resource shifts induced by MFT 
is not large. Six percent of the labor force is reallocated intersector- 
ally. What is interesting is that the pattern of intersectoral shifts is, as 
in the case of UFT, quite the opposite of what one would expect from
a traditional comparative advantage point of view. In terms of em¬ 
ployment, the manufacturing sector gains at the expense of the rest of 
the economy. Total employment in manufacturing increases by 12 
percent. 

A dramatic increase in trade volume accompanies the move to 
MFT. Trade volume increases from $84 billion in the reference equi¬ 
librium to about $160 billion in the new equilibrium, a gain of 89 
percent. As in the UFT case, under MFT the manufacturing sector 
moves from an initial deficit position in the trade account to a surplus 
position. The resulting change in intraindustry trade is negligible, 
suggesting that the increase in trade volume is accounted for, 
roughly, by an equal increase in intraindustry and interindustry 
trade. 
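
The intraindustry trade index behind this statement is the trade-weighted Balassa-Grubel-Lloyd (BGL) index described in the note to table 1. A minimal sketch follows; it assumes the standard form 1 - |E - M| / (E + M) for each industry, and the function names and data are hypothetical.

```python
def bgl_index(exports, imports):
    """Industry-level index: 1 for balanced two-way trade, 0 for one-way trade."""
    volume = exports + imports
    if volume <= 0:
        raise ValueError("industry must have positive trade volume")
    return 1.0 - abs(exports - imports) / volume


def weighted_bgl(exports_by_industry, imports_by_industry):
    """Average of industry indexes, weighted by shares of total trade volume."""
    pairs = [(e, m) for e, m in zip(exports_by_industry, imports_by_industry)
             if e + m > 0]  # industries with no trade carry zero weight
    total = sum(e + m for e, m in pairs)
    return sum((e + m) / total * bgl_index(e, m) for e, m in pairs)
```

For example, `weighted_bgl([10.0, 0.0], [10.0, 20.0])` averages one industry with perfectly balanced two-way trade and one with imports only, each carrying half of total trade volume.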

How do the multilateral cut results relate to the results one might 
obtain for world free trade? Since we do not have a world model, we 
can only offer an educated guess. There are two effects to consider, 
both of which are likely to increase the Canadian gains to free trade. 
First, world free trade, as it is likely to be income creating, should not 
result in adverse shifts in Canadian export demand equations on 
average. Second is the question of terms-of-trade shifts. One probable
scenario is that real world wages would, on average, tend to rise more 
than the real rental rate on capital services. As Canada, in 1976, was a 
net capital service importer and net commodities exporter, this terms- 
of-trade effect should in general raise Canadian real income. We 
therefore tentatively suggest that the welfare results of the multilateral
tariff cuts underestimate the true 1976 gain to Canada from
world free trade.


Industry Results 

In table 3 the industry results from the MFT experiments are pre¬ 
sented. In terms of gross output almost all industries expand their 
production. Indeed, 15 expand their output by over 25 percent. The 
four industries that contract under MFT are within the manufactur¬ 
ing sector. The results in terms of value added are similar, though less 
pronounced. Twenty industries increase their production of value 
added, and of these 19 increase by over 10 percent. Although most 
industries expand their production, measured by either gross output 
or value added, the impact on employment is not as uniform. Only 13 
industries actually increase their employment under MFT. In the 
manufacturing sector the results are particularly diverse, with only 
nine industries increasing their employment. 

In examining the productivity and cost efficiency statistics the im¬ 
pact of rationalization effects is quite apparent. In the manufacturing 
sector, all industries increase the length of their production runs. In 
the case of 17 industries, the increase is greater than 50 percent. The 
lengthening of production runs is accompanied by a fall in both scale 
elasticities and markups in all industries. Clearly an important impact 
of MFT is the realization of unexploited scale economies, gains larger 
than those achieved under UFT. 

In terms of winners and losers, the industry that does best on the 
basis of employment and value added is transportation equipment. 
The reasons why transportation equipment does so well are the same 
as those discussed under UFT. This industry does better under MFT
than UFT, of course, because its competitive position in world mar¬ 
kets improves with the removal of foreign tariffs. The industries that 
do poorly under MFT—leather, knitting mills, clothing, and furni¬ 
ture—share in common the characteristic of a very labor intensive 
technology. Clearly, the 25 percent increase in the real wage puts 
these industries in a very poor position relative to imports. Indeed in 
one case, that of knitting mills, import competition is so great that the 
industry is virtually wiped out. 


IV. Sensitivity Results 

In tables 4 and 5 the results of some sensitivity tests with respect to the 
key parameters in the model are presented. Four sets of sensitivity 
experiments were conducted. First, the import elasticities are varied 
12. Primary metals       37.53   20.69    -44.37   21.92   5.77   -34.51
13. Metal fabricating    22.61   10.51   -295.93   25.93   8.23   -47.06
TABLE 4

Sensitivity Analysis: Unilateral Free Trade

Elasticity of Imports Scaling Parameter MSCAL

MSCAL                   .33      .66      1.0     1.33     1.66
Welfare gain           1.65     2.76     4.18     5.96     8.27
Wage                    .24     4.69     9.98    16.36    23.42
Trade volume          12.20    29.40    53.14    82.05   110.75
Labor productivity     5.62    11.37    19.57    30.06    41.20

Elasticity of Exports Scaling Parameter XSCAL

XSCAL                   .17      .33      .66      1.0      2.0
Welfare gain           4.28     4.25     4.19     4.18     3.97
Wage                  10.47    10.36    10.16     9.98     9.48
Trade volume          53.60    53.58    53.36    53.14    52.61
Labor productivity    19.59    19.58    19.57    19.57    19.56

Minimum Efficient Scale Scaling Parameter NSCAL

NSCAL                   .33      .66      1.0     1.33
Welfare gain           2.51     3.23     4.13     5.40
Wage                   4.31     6.85     9.98    14.05
Trade volume          33.21    40.97    53.14    67.65
Labor productivity     7.04    11.32    19.57    31.88

Pricing Hypothesis Parameter PSCAL

PSCAL                [illegible]       .4       .5       .6       .7
Welfare gain         [illegible]     3.29     4.13     5.14     7.85
Wage                 [illegible]     8.25     9.98    12.20    18.96
Trade volume         [illegible]    52.02    53.14    54.41    59.38
Labor productivity   [illegible]    18.26    19.57    21.38    27.55

NOTE: The percentage changes reported in each column are relative to the
all-tariffs-in-place equilibrium. In each case the values of all scaling
parameters are at their "base" values, except for the parameter under
consideration.

by uniformly scaling them up and down by a factor of proportionality 
referred to as MSCAL. The higher the import elasticity, the more 
prone domestic industries are to import competition. Furthermore, 
the import elasticity determines the extent to which foreign and 
domestic goods are viewed as substitutes. If these goods are highly 
substitutable, the effect is to raise the price elasticity of 
export demand but not the foreign tariff elasticity. The second 
sensitivity study is on export elasticities; these are all scaled up and down 
by a parameter called XSCAL. An increase in the export elasticity will 
increase the price and foreign tariff elasticity of domestic export 
demand. The third sensitivity experiment is on the economies of scale 
estimates; the adjustment parameter is referred to as NSCAL. The 
final sensitivity experiment is with respect to the pricing hypothesis of 
firms in noncompetitive industries. Recall that the price set by firms is 
a weighted average of the prices determined by the monopolistically 
TABLE 5 

Sensitivity Analysis: Multilateral Free Trade 


Elasticity of Imports Scaling Parameter MSCAL 

MSCAL                   .33       .66       1.0      1.33      1.66
Welfare gain           8.96      7.96      8.59     10.07     12.19
Wage                  22.88     22.85     25.21     29.51     34.77
Trade volume          55.09     68.13     88.60    114.18    140.00
Labor productivity    23.77     26.47     32.62     41.60     51.54


Elasticity of Exports Scaling Parameter XSCAL 

XSCAL                   .17       .33       .66       1.0       2.0
Welfare gain           4.89      5.50      6.90      8.59     16.70
Wage                  12.53     14.60     19.46     25.21     53.24
Trade volume          58.75     63.96     75.38     88.60    146.45
Labor productivity    21.46     23.42     27.74     32.62     52.27


Minimum Efficient Scale Scaling Parameter NSCAL 

NSCAL                   .33       .66       1.0      1.33
Welfare gain           6.11      7.14      8.59     10.83
Wage                  18.07     20.90     25.21     31.42
Trade volume          67.60     74.42     88.60    108.23
Labor productivity    18.52     22.88     32.62     48.03


Pricing Hypothesis Parameter PSCAL 

PSCAL                    .2        .4        .5        .6        .8
Welfare gain           4.31      6.87      8.59     10.75     16.32
Wage                  14.73     20.95     25.21     30.85     50.84
Trade volume          70.92     82.07     88.60     96.55    122.81
Labor productivity    22.26     28.44     32.62     37.99     56.98

Note: See table 4 n. 


competitive price $p^{MCPH}$ and the Eastman-Stykolt price $p^{ESH}$. The 
formula is: 

$$p = (1 - \text{PSCAL}) \cdot p^{MCPH} + \text{PSCAL} \cdot p^{ESH}, \qquad 0 \le \text{PSCAL} \le 1.$$

Various values of the parameter PSCAL are considered, with in¬ 
creases in PSCAL leading to a higher weight being attached to the 
more collusive ESH price. 
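
The weighting can be illustrated with a minimal Python sketch. The function name and the two input prices are hypothetical and chosen only for illustration; the model's actual prices come out of the full equilibrium computation, which is not reproduced here.

```python
def blended_price(p_mcph: float, p_esh: float, pscal: float) -> float:
    """Weighted average of the monopolistically competitive (MCPH)
    price and the Eastman-Stykolt (ESH) price, with weight PSCAL on ESH."""
    if not 0.0 <= pscal <= 1.0:
        raise ValueError("PSCAL must lie between 0 and 1")
    return (1.0 - pscal) * p_mcph + pscal * p_esh


# Hypothetical prices (not taken from the paper).
p_mcph, p_esh = 1.00, 1.20

# The PSCAL grid is the one used in the sensitivity tables.
for pscal in (0.2, 0.4, 0.5, 0.6, 0.8):
    print(pscal, round(blended_price(p_mcph, p_esh, pscal), 3))
```

With these illustrative inputs the blended price moves linearly from the MCPH price toward the ESH price as PSCAL rises.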

The values of the parameters MSCAL, XSCAL, and NSCAL are all 
set equal to one in the reference equilibrium, while the parameter 
PSCAL is set equal to 0.5. First, consider the impact of altering the 
value of MSCAL. Letting MSCAL vary from 0.33 to 1.66 results in 
the real income gains of UFT varying from a low of 1.6 to a high of 8 
percent, quite significant changes. Under MFT the range in which 
the measured real income gains vary is smaller: from 7.9 to 12 percent. 
In this case the welfare gain actually falls and then rises. The 
lower gains to MFT occur for values of MSCAL close to the base 
values. As one might expect, the volume of trade is quite sensitive to 
changes in the import elasticity, for both UFT and MFT. Labor pro¬ 
ductivity is also quite sensitive to altering the import elasticities. In the 
case of UFT the variation in labor productivity is quite large, varying 
from a value of 5.6 to 41 percent. The explanation of this is 
straightforward: import competition will lead to more industry 
rationalization the greater the extent to which foreign goods actually 
compete with domestically produced goods. The results from this 
experiment indicate that the gains to UFT are quite sensitive to the 
actual values of the import elasticities, more so than the MFT results. 
For all values of the import elasticities the gains in real income under 
MFT are equal to or greater than 8 percent. 

In the export elasticity experiment, the value of XSCAL varies 
from a low of about one-sixth (0.17) to a high of twice the base value. Under UFT 
the measured gain in welfare is very insensitive to variations in 
XSCAL. The welfare gain varies slightly, from a low of 4 percent to a 
high of 4.3 percent, which is quite insignificant. However, this is not true 
under MFT. In this case the gain in welfare varies considerably, from 
a low of 4.9 percent to a high of 17 percent. Particularly in the upper 
range of the export elasticities the welfare gains get quite large. The 
industry results are not reported, but for the higher values of the 
export elasticities, a number of domestic industries shut down. The 
rationalization effects become quite strong and the economy moves to 
a more specialized pattern of production. With regard to trade 
volume and labor productivity the results parallel those of the welfare 
measure. Under UFT, both indexes show little variation while under 
MFT the indexes are quite sensitive, both increasing substantially with 
increases in XSCAL. Trade volume increases from a low of 58 percent 
to a high of over 140 percent; labor productivity increases from 
21 percent to 52 percent. Overall this experiment suggests that values 
of the export elasticities are quite significant in determining the 
benefits to MFT but insignificant in the case of UFT. 

The results of the economies-of-scale experiments are of consider¬ 
able interest. Recall that the cost functions of the noncompetitive 
firms are scaled so as to reproduce observed MES and CDR estimates. 
The extent of economies of scale will thus vary directly with the mag¬ 
nitude of these estimates. The base estimates lie midway between the 
econometric and average engineering estimates. Since there is a great 
deal of controversy surrounding the reliability of these estimates, the 
parameter NSCAL was varied from a low of one-third the base value 
to 1.33 times the base value. Not surprisingly, both the UFT and MFT 
results are sensitive to these variations. Increasing NSCAL results in 
substantial increases in the volume of trade and labor productivity for 
both trade policies. Indeed, under MFT, labor productivity increases 
by 48 percent when NSCAL reaches 1.33. The estimated welfare 
gains are also quite sensitive. For UFT, the welfare gain varies from a 
low of 2.5 percent to a high of 5.4 percent. Under MFT, the gain 
varies from a low of 6.1 percent to a high of 10.8 percent. For values 
not reported, as NSCAL increases beyond 1.33 a number of indus¬ 
tries cease to operate, under both UFT and MFT. As in the case of 
high values of the export elasticities, a movement to freer trade in¬ 
volves increased specialization within the manufacturing sector of the 
economy. It is the labor-intensive, low-economies-of-scale manufacturing 
industries that shut down. In summary, the benefits of both 
UFT and MFT are sensitive to assumptions about the degree of scale 
economies, although in both cases they are positive and significantly 
so. Even extremely conservative estimates of scale economies yield 
welfare gains to MFT of 6 percent. 

The last set of experiments varies 
the pricing hypothesis parameter PSCAL. The nature of the parameter 
PSCAL is different from that of the others in that it influences 
the actual pricing behavior of domestic firms. The value of 
PSCAL ranges from a low of 0.2 to a high of 0.8, low values giving 
greater weight to the MCPH hypothesis and high values to the ESH. The 
results of the model turn out to be quite sensitive to changes in 
PSCAL. The gain in welfare varies considerably. Under UFT the 
range is from a low of 2 percent to a high of 7.8 percent. For MFT the 
variation is even larger, from a low of 4.3 percent to a high of 
16.3 percent. The results for trade volume and labor productivity 
follow a similar pattern, rising with increases in 
PSCAL. At the industry level a number of industries begin to shut 
down as the ESH in pure form is more closely approximated; this 
holds true for both UFT and MFT. Again the first industries to close 
down are the labor-intensive ones as the economy begins to specialize. 
The results of this experiment indicate that the impact of UFT and 
MFT is sensitive to the underlying behavioral hypothesis in the non¬ 
competitive industries. Larger benefits to freer trade are achieved the 
greater the degree to which domestic industry responds to foreign 
price competition. 
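
The ranges quoted in this section can be recomputed from the welfare-gain rows of tables 4 and 5. The following is a small Python sketch; the dictionary layout and the function name are illustrative choices, and only rows whose entries are clearly legible in the tables are included.

```python
# Welfare gains (percent) by scaling parameter, transcribed from the
# legible rows of tables 4 (UFT) and 5 (MFT).
welfare_gain = {
    ("UFT", "MSCAL"): [1.65, 2.76, 4.18, 5.96, 8.27],
    ("UFT", "NSCAL"): [2.51, 3.23, 4.13, 5.40],
    ("MFT", "MSCAL"): [8.96, 7.96, 8.59, 10.07, 12.19],
    ("MFT", "XSCAL"): [4.89, 5.50, 6.90, 8.59, 16.70],
    ("MFT", "NSCAL"): [6.11, 7.14, 8.59, 10.83],
    ("MFT", "PSCAL"): [4.31, 6.87, 8.59, 10.75, 16.32],
}


def spread(values):
    """Return (low, high, high - low) for one row of a table."""
    low, high = min(values), max(values)
    return low, high, round(high - low, 2)


for (policy, parameter), row in welfare_gain.items():
    low, high, width = spread(row)
    print(f"{policy} {parameter}: low={low}, high={high}, range={width}")
```

Note that for MFT under MSCAL the lowest gain is not at either end of the grid, which is the "falls and then rises" pattern described in the text.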


V. Conclusions 

This paper has investigated the potential impact of two trade liberalization 
policies on the Canadian economy. Important features of the 
general equilibrium model used to examine these questions include 
the presence of economies of scale and imperfect competition within 
the manufacturing sector. Using a 1976 data set, we performed 
counterfactual policy experiments with the model. The findings of 
the experiments indicate that trade liberalization would provide large 
benefits to the Canadian economy. For a wide range of parameter 
values, the welfare gains from a unilateral free trade policy were 
found to be in a range of 2-5 percent of GNP, while the benefits to 
multilateral free trade were found to lie in the range of 8-10 percent 
of GNP, numbers much larger than conventional estimates. The 
mechanism through which many of these benefits are achieved is 
intraindustry rationalization. Indeed, the intersectoral reallocation of 
labor was found to be reasonably small under both trade policies, 
suggesting that the adjustment costs of adopting a free trade policy 
may not be large. 

The incorporation of economies of scale and imperfect competition 
within a general equilibrium model provides an additional channel 
through which resource allocation is determined, a possibility that 
does not exist in conventional trade models. The results of the experiments 
reported here indicate that for the Canadian economy intraindustry 
rationalization is an important channel through which benefits 
from trade liberalization are achieved. The message seems to be that 
economic models that ignore the industrial organization aspects of 
the economy, at least for small open economies, may seriously underestimate 
the benefits to trade liberalization. If one accepts economies of 
scale and imperfect competition as relevant to small open economies 
with a significant industrial sector, the policy implications are clear: 
the costs of protection are very high and efforts to promote free trade 
are well founded on grounds of gains in economic efficiency. The 
results were shown to be sensitive to the trade elasticities and scale 
economy parameters. Because of the great importance of these parameters, 
it is of some importance to get reliable estimates. The overall 
significance of the methodological framework used in this paper 
will be fully known only when similar studies are carried out for other 
small open economies. 


Appendix 

This Appendix outlines the equations of the model. For the sake of brevity 
the model will be detailed without taxes, tariffs, or subsidies. In the empirical 
implementation of the model most of the relevant tax and tariff distortions 
are present; see Harris (1984b) for more details. 

1. Notation 

N: index set for noncompetitive industries 
C: index set for competitive industries 
B: index set for noncompeting imports 
M = N ∪ C, G = M ∪ B 
$(p_i)_{i \in M}$: domestic commodity prices 
$(p_i^*)_{i \in M}$: foreign commodity prices 
$(q_i)_{i \in B}$: foreign prices on noncompeting imports 
w: domestic wage 
r: world rental rate on capital 
P = (p, p*, q, r, w): price system. 


2. Technology 

All firms have a variable unit cost function $V^i(P)$, assumed independent of 
the level of output, of the form 

$$\log V^i(P) = \alpha_{i0} + \sum_{j \in M} \alpha_{ij} \log w_j^i + \sum_{k \in B} \alpha_{ik} \log q_k + \alpha_{iw} \log w + \alpha_{ir} \log r. \tag{A1}$$

$w_j^i$ is the price index of a composite input used by industry i, a composite of 
both foreign and domestic inputs from commodity j. $w_j^i$ is assumed to have the 
form 

$$\log w_j^i = \beta_{ij} \log p_j + (1 - \beta_{ij}) \log p_j^*. \tag{A2}$$

If price-taking behavior in input markets is assumed, the input-output matrices 
for the economy are derived from the unit cost functions by applying 
Shephard's lemma. The domestic Leontief matrix $A(P) = [a_{ij}(P)]$ is defined by 

$$a_{ij}(P) = \frac{\alpha_{ij} \beta_{ij} V^i(P)}{p_j}, \tag{A3}$$

where $a_{ij}$ is the demand for domestic good j per unit output of domestic good 
i. The Leontief matrix for foreign imports $A^*(P)$ is defined as 

$$a_{ij}^*(P) = \frac{\alpha_{ij} (1 - \beta_{ij}) V^i(P)}{p_j^*}. \tag{A4}$$

The fixed costs of each representative firm in each noncompetitive industry, 
i ∈ N, are given by the function 

$$F_i(r, w) = r f_i^K + w f_i^L, \tag{A5}$$

where $f_i^K$ and $f_i^L$ are the minimum amounts of capital and labor, respectively, 
needed to set up a plant. 
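
The step from the unit cost function to the input-output coefficients can be checked numerically. The sketch below is in Python and makes several assumptions: a single industry with two intermediate goods, a log-linear unit cost with a Cobb-Douglas composite of domestic and foreign inputs, and parameter values that are purely hypothetical. It compares the coefficient implied by Shephard's lemma (cost share times domestic weight times unit cost, divided by the input price) with a finite-difference derivative of the unit cost.

```python
import math

# Hypothetical parameters for one industry (not taken from the paper).
PARAMS = {
    "alpha0": 0.0,
    "alpha": [0.3, 0.2],   # cost shares of the two composite inputs
    "beta": [0.6, 0.8],    # domestic weight inside each composite
    "alpha_w": 0.3,        # labor share
    "alpha_r": 0.2,        # capital share
}


def unit_cost(p, p_star, w, r, prm=PARAMS):
    """Log-linear unit cost with composite domestic/foreign inputs."""
    log_v = prm["alpha0"] + prm["alpha_w"] * math.log(w) + prm["alpha_r"] * math.log(r)
    for a_j, b_j, pj, pj_star in zip(prm["alpha"], prm["beta"], p, p_star):
        log_wj = b_j * math.log(pj) + (1.0 - b_j) * math.log(pj_star)
        log_v += a_j * log_wj
    return math.exp(log_v)


def domestic_input_coeff(j, p, p_star, w, r, prm=PARAMS):
    """Shephard's lemma: dV/dp_j = alpha_j * beta_j * V / p_j."""
    return prm["alpha"][j] * prm["beta"][j] * unit_cost(p, p_star, w, r, prm) / p[j]


def numerical_derivative(j, p, p_star, w, r, step=1e-6):
    """Central finite difference of the unit cost in the j-th domestic price."""
    up, down = list(p), list(p)
    up[j] += step
    down[j] -= step
    return (unit_cost(up, p_star, w, r) - unit_cost(down, p_star, w, r)) / (2 * step)


p, p_star, w, r = [1.0, 2.0], [1.5, 1.8], 1.2, 0.9
for j in range(2):
    print(j, domestic_input_coeff(j, p, p_star, w, r), numerical_derivative(j, p, p_star, w, r))
```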


3. Exports 

We assume there is a world demand for a composite export good Q, which is 
an aggregate of all relevant countries' exports. The composite export good is 
defined by 

$$Q = [\beta X^{-\lambda} + (1 - \beta) X^{*\,-\lambda}]^{-1/\lambda}, \tag{A6}$$

where X is domestic exports and X* is all other countries' exports; $\sigma^* = 1/(1 + \lambda)$ 
is the elasticity of substitution between domestic and other exports in 
world demand. Letting p and p* be the prices paid by world consumers of 
domestic and other exports, the dual price index for the export composite is 

$$P = [\beta^{\sigma^*} p^{1-\sigma^*} + (1 - \beta)^{\sigma^*} p^{*\,1-\sigma^*}]^{1/(1-\sigma^*)}. \tag{A7}$$

For a given Q, cost minimization yields the demand for domestic exports 

$$X = \beta^{\sigma^*} \left(\frac{P}{p}\right)^{\sigma^*} Q. \tag{A8}$$

Assuming a constant elasticity demand for world exports of the form 

$$Q = a P^{-\epsilon} \tag{A9}$$

yields, on substitution into (A8), the demand for domestic exports given by 

$$X = a \beta^{\sigma^*} p^{-\sigma^*} P^{\sigma^* - \epsilon}. \tag{A10}$$
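
The export block can be sanity-checked numerically. The Python sketch below assumes the standard CES forms (a composite with share parameter beta and substitution elasticity sigma, and its dual price index); the symbol `eps` for the world demand elasticity, the function names, and all numerical values are illustrative assumptions. It verifies that the cost-minimizing demands reproduce the composite quantity and that spending on the two goods equals the price index times the composite.

```python
def ces_price_index(p, p_star, beta, sigma):
    """Dual price index of the CES export composite (sigma != 1)."""
    inner = beta**sigma * p**(1 - sigma) + (1 - beta)**sigma * p_star**(1 - sigma)
    return inner ** (1 / (1 - sigma))


def ces_composite(x, x_star, beta, sigma):
    """CES aggregate of domestic and other exports; sigma = 1/(1 + lam)."""
    lam = 1 / sigma - 1
    return (beta * x**(-lam) + (1 - beta) * x_star**(-lam)) ** (-1 / lam)


def ces_demands(q, p, p_star, beta, sigma):
    """Cost-minimizing demands for a given composite quantity q."""
    big_p = ces_price_index(p, p_star, beta, sigma)
    x = (beta * big_p / p) ** sigma * q
    x_star = ((1 - beta) * big_p / p_star) ** sigma * q
    return x, x_star, big_p


def domestic_export_demand(p, p_star, beta, sigma, a, eps):
    """Demand for domestic exports when world demand is q = a * P**(-eps)."""
    big_p = ces_price_index(p, p_star, beta, sigma)
    return a * beta**sigma * p**(-sigma) * big_p**(sigma - eps)


# Hypothetical values, for illustration only.
p, p_star, beta, sigma, q = 1.1, 1.0, 0.4, 2.0, 10.0
x, x_star, big_p = ces_demands(q, p, p_star, beta, sigma)
print(ces_composite(x, x_star, beta, sigma), q)   # composite is reproduced
print(p * x + p_star * x_star, big_p * q)         # spending equals P * Q
```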


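4. Domestic Final Demand 

The consumer's utility function over commodity aggregates is given by the 
log-linear form 

$$\log U = \alpha_0 + \sum_{i \in G} \alpha_i \log C_i. \tag{A11}$$

For noncompeting imports, i ∈ B, $C_i$ represents the amount of the import 
good. For all other industries, i ∈ M, $C_i$ is the CES aggregator over foreign 
and domestic goods: 

$$C_i = [\delta_i X_i^{-\rho_i} + (1 - \delta_i) X_i^{*\,-\rho_i}]^{-1/\rho_i}. \tag{A12}$$

Given disposable income Y and prices P, the demand for domestic good $X_i$ is 
given by 

$$X_i = \frac{\alpha_i Y \delta_i^{\sigma_i} p_i^{-\sigma_i}}{\delta_i^{\sigma_i} p_i^{1-\sigma_i} + (1 - \delta_i)^{\sigma_i} p_i^{*\,1-\sigma_i}}, \tag{A13}$$

where $\sigma_i = 1/(1 + \rho_i)$ is the elasticity of substitution between domestic 
and foreign goods. 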


5. Short-Run Equilibrium 

The industry structure variables held constant in the short run are markups 
on unit costs by firms, $(m_i)_{i \in N} = m$, and the number of firms in each industry, 
$(\mathit{Fm}_i)_{i \in N} = \mathit{Fm}$. Let $S = (m, \mathit{Fm})$ be the vector of structural variables. 
Aggregate consumer income is given by 

$$Y = wL + rK_0 + \tau \sum_{i \in N} \pi_i, \tag{A14}$$

where L is the aggregate labor endowment, $K_0$ the domestic capital endowment, 
$\pi_i$ the short-run profits or losses in industry i ∈ N, and $\tau$ the share of 
domestic ownership in industry. 

Equilibrium commodity prices are determined by the equations 

$$p_i = m_i V^i(P), \quad i \in N; \qquad p_i = V^i(P), \quad i \in C. \tag{A15}$$

Letting X(P, Y, S) and E(P) denote final demand vectors, commodity market 
clearing implies the vector of total outputs, Z, must satisfy 

$$Z = [I - A(P)^T]^{-1} [X(P, Y, S) + E(P)]. \tag{A16}$$

Given the vector of domestic outputs, labor market equilibrium requires 

$$L = \sum_{i \in M} a_{iL}(P) Z_i + \sum_{i \in N} \mathit{Fm}_i \cdot f_i^L, \tag{A17}$$

where $a_{iL}$ is the labor requirements coefficient in industry i. Industry profits 
$\pi_i$ are 

$$\pi_i = [p_i - V^i(P)] Z_i - \mathit{Fm}_i \cdot F_i(r, w). \tag{A18}$$

A short-run equilibrium for a given S is a wage w(S), domestic commodity 
price vector p(S), income Y(S), and vector of outputs Z(S) satisfying (A15)-(A18). 
Walras's Law implies a balance of payments equilibrium of the form 

$$\sum_{i \in M} p_i E_i - (\text{value of imports}) = r(K - K_0) + (1 - \tau) \sum_{i \in N} \pi_i, \tag{A19}$$

where K is the total domestic demand for capital services. 
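
The market-clearing step can be illustrated with a two-industry example in Python. The sketch assumes the convention that the (i, j) entry of the input matrix is the amount of good j used per unit output of good i, so that total outputs solve Z = A'Z + D, where D is final demand plus exports and A' is the transpose. The matrix and demand numbers are hypothetical.

```python
def solve_outputs_2x2(A, D):
    """Solve (I - A^T) Z = D for two industries by Cramer's rule.

    A[i][j] is assumed to be the input of good j per unit output of
    good i; D[j] is final demand plus exports for good j.
    """
    # Entries of I - A^T.
    m00, m01 = 1.0 - A[0][0], -A[1][0]
    m10, m11 = -A[0][1], 1.0 - A[1][1]
    det = m00 * m11 - m01 * m10
    if det == 0:
        raise ValueError("I - A^T is singular")
    z0 = (D[0] * m11 - m01 * D[1]) / det
    z1 = (m00 * D[1] - m10 * D[0]) / det
    return [z0, z1]


# Hypothetical input coefficients and demands.
A = [[0.1, 0.2],
     [0.3, 0.1]]
D = [100.0, 50.0]
Z = solve_outputs_2x2(A, D)

# Each good's output covers intermediate use by both industries plus D.
for j in range(2):
    intermediate = sum(A[i][j] * Z[i] for i in range(2))
    print(j, Z[j], intermediate + D[j])
```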


6. Firm Behavior 

Under the monopolistically competitive pricing hypothesis, each firm in industry 
i ∈ N perceives an industry demand curve of the constant elasticity 
form 

$$z_i = v_i p_i^{-\epsilon_i}. \tag{A20}$$

Under the assumption that industry demand is shared equally among firms, 
the optimal pricing rule is given by 

$$\frac{p_i - V^i}{p_i} = \frac{1}{\mathit{Fm}_i \, \epsilon_i}. \tag{A21}$$

In the long run the perceived elasticity is equated to the elasticity of the true 
demand curve, which is given by 

$$\epsilon_i = \frac{X_i}{Z_i} \eta_i^X + \frac{E_i}{Z_i} \eta_i^E + \sum_{j \in M} \frac{I_{ij}}{Z_i} \eta_{ij}^I, \tag{A22}$$

where $\eta_i^X$ is the elasticity of final demand, $\eta_i^E$ is the elasticity of exports, $\eta_{ij}^I$ is the 
elasticity of intermediate demand, and $I_{ij} = a_{ji} Z_j$ is the intermediate use of 
commodity i by industry j. 

Under the Eastman-Stykolt pricing hypothesis, 

$$p_i = p_i^* (1 + t_i), \tag{A23}$$

where $t_i$ is the domestic tariff. 
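
The two pricing rules can be compared in a short Python sketch. It assumes the markup condition takes the Lerner form (price minus unit cost, over price, equals one over the number of firms times the perceived elasticity) and that the Eastman-Stykolt price is the foreign price marked up by the tariff. The function names and numbers are hypothetical.

```python
def mcph_price(unit_cost: float, n_firms: float, elasticity: float) -> float:
    """Price solving (p - V) / p = 1 / (n_firms * elasticity)."""
    scale = n_firms * elasticity
    if scale <= 1.0:
        raise ValueError("n_firms * elasticity must exceed 1 for a finite price")
    return unit_cost / (1.0 - 1.0 / scale)


def esh_price(foreign_price: float, tariff: float) -> float:
    """Eastman-Stykolt price: foreign price plus the domestic tariff."""
    return foreign_price * (1.0 + tariff)


# Hypothetical inputs: unit cost 1.0, four firms, perceived elasticity 1.5,
# foreign price 1.0, and a 15 percent tariff.
p_mcph = mcph_price(1.0, 4, 1.5)
p_esh = esh_price(1.0, 0.15)
print(round(p_mcph, 4), round(p_esh, 4))
```

More firms or a more elastic perceived demand push the first price toward unit cost, while the second depends only on the foreign price and the tariff.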
Transactions Costs and the Optimal Quantity 
of Money 


Scott Freeman 

Boston College 


This paper models a physical environment in which agents rationally 
choose to hold both fiat money and higher return capital. This occurs 
because information barriers permit bona fide capital to be 
distinguished from bogus titles to capital and bogus IOUs only at a 
cost. One implication is that the holding period for capital is greater 
than that for fiat money. The information barriers also limit the 
government interventions that are feasible and, in so doing, overturn 
standard notions concerning the optimal quantity of money. 


In this paper, I model a physical environment in which agents rationally 
choose to hold both fiat money and a productive asset with a 
greater rate of return. Moreover, the fiat money in this model circulates 
with a greater velocity than does the productive asset. 

In this model, agents live 3-period lives and are members of gener¬ 
ations that overlap. They receive an endowment of the model’s single 
consumption good only in the first period of life and must, therefore, 
hold assets to provide for consumption in other periods of life. When 
the consumption good is stored, it only yields a return 2 periods after 
it was stored. As a result, storage can be used to provide for consump¬ 
tion in the third period of life. 

The crucial assumptions of the model are restrictions on the information 
available to agents that hinder the efforts of agents to finance 
second-period consumption through borrowing or the sale or purchase 
of “half-baked” storage. I assume that the environment permits 
the existence of bogus IOUs and bogus titles to storage that can only 
be distinguished from bona fide assets at a cost. These “transactions 
costs” are incurred only if these assets are exchanged. When these 
costs are sufficiently high, agents who offer IOUs and titles to storage 
will find no buyers for these assets at positive prices. In this case, since 
fiat money is assumed not to suffer from the recognition problem, 
agents will hold it alone to provide for consumption in the second 
period of life. If the 2-period rate of return on storage exceeds that of 
fiat money, the model will display an equilibrium in which agents hold 
a productive asset over a long period of time because of its high rate 
of return while they meet interim needs with fiat money because of its 
low transactions cost. 

The difference in the rates of return of the two assets in this equi¬ 
librium implies that the marginal rates of substitution differ for 
agents alive at the same time, suggesting that the equilibrium with a 
constant supply of fiat money (hereafter called laissez faire) may not 
be Pareto optimal. I consider a deflation of the money supply as a 
means of erasing differences in rates of return and marginal rates of 
substitution. An important feature of the model is that the informa¬ 
tion barriers essential to the restriction of private trades of IOUs and 
capital prevent the government from financing deflation from the 
return of government-held capital. 

A government policy consistent with these information restrictions 
is deflation of the stock of fiat money financed by lump-sum taxes on 
agents. For this type of policy I will establish the following welfare 
propositions: (i) deflation cannot provide the steady-state lifetime utility 
that would have been achievable in an economy with no information 
barriers; (ii) deflation lowers steady-state utility if financed by 
taxes on the young or middle-aged; (iii) some degree of deflation will 
increase steady-state lifetime utility if financed by taxes on the old; (iv) 
deflation to achieve rate-of-return equality may result in lower steady- 
state lifetime utility than laissez faire, even if financed by taxes on the 
old. 

I present the basic model in Section I. In Section II, I establish the 
welfare propositions above and compare the implications for the op¬ 
timal quantity of money to those of the neoclassical literature. I con¬ 
clude in Section III. 

I. The Basic Model 

Agents live 3 periods and are members of overlapping generations as 
in Samuelson (1958). The utility of each agent is an additively separable 
function of his consumption of the economy’s single good in each 
period of the agent’s life. The utility function, identical for all individuals, 
is strictly concave and continuous with continuous positive first 
partial derivatives with respect to each argument. The first partial 
derivative with respect to each argument is infinity when that argu¬ 
ment is zero. Agents are endowed with y units of the consumption 
good in the first period of life and with nothing in other periods. I 
denote the number of people born at period t by N(t) and assume that 
N(t) = nN(t - 1), where n is a positive constant. 

If agents put k(t) units of the consumption good into the storage 
process at time t, $a^2 k(t)$ units of the consumption good will be received 
from storage at time t + 2. The parameter a is a positive constant 
greater than n. There is no return from these stored units in any 
other period. Fractional storage is possible and goods stored are per¬ 
fectly divisible. 

In addition, agents can constantly and costlessly create worthless 
fake storage that appears identical to bona fide storage to all but its 
creator. Other agents may learn whether storage is worthless only by 
using up some amount of the consumption good. This investigation 
cost is assumed to be large enough to deter such investigations. An 
agent’s chosen levels of consumption, storage, and money holdings 
are private information. In this environment no one would ever offer 
to purchase storage. An offer to purchase such units at a positive 
price would only induce a flood of fake units. 

An individual’s age is assumed to be known only to that individual. 
In addition, I assume that the amount that an agent has borrowed 
from others is not known. Note that each agent in his last period of 
life will wish to issue an IOU at any interest rate to every potential 
lender. Because potential lenders cannot distinguish the IOUs issued 
by agents who will still be alive when their notes are due from those 
issued by people in their last period of life, no private IOUs will be 
accepted if there is an alternative asset (such as fiat money) with a 
nonnegligible rate of return. 


Laissez-Faire Monetary Equilibria 

A monetary equilibrium is identified to be a perfect-foresight compet¬ 
itive equilibrium in which fiat money has value. Fiat money is defined 
as intrinsically useless, noncounterfeitable pieces of paper that are 
costlessly produced by the government. They can be costlessly stored, 
costlessly identified, and costlessly transferred from one individual to 
another. In a laissez-faire monetary equilibrium there is a fixed sup¬ 
ply of fiat money. Let M represent the number of units of fiat money. 

Given the information structure assumed above, agents are unable 
to store, borrow, or lend to provide for consumption in the second 
period of life and must therefore purchase fiat money when young 
and sell it when middle-aged. I will demonstrate below that if a > n 
(as assumed), the equilibrium rate of return of fiat money in a steady- 
state equilibrium is less than the rate of return on storage. In this case 
agents will store to provide for all their consumption in old age, 
generating the phenomena listed in the Introduction. This transac¬ 
tion pattern, but featuring 2-period bonds with transactions costs, is 
also displayed in Martins (1980). 

Therefore, the information barriers justify the following formula¬ 
tion of the choice problem of a representative young person born at t 
when a exceeds the rate of return on fiat money. Such a person 
chooses nonnegative k(t), the quantity of the consumption good to be 
stored at t, and m(t), the number of units of fiat money to be purchased 
at t, to maximize $U[c_1(t), c_2(t), c_3(t)]$ subject to 
$c_1(t) = y - k(t) - p(t)m(t)$; $c_2(t) = p(t+1)m(t)$; and $c_3(t) = a^2 k(t)$, where $c_i(t)$ is the 
consumption of a representative individual of generation t in the ith 
period of life and p(t) is the price of a unit of fiat money at time t in 
units of the time t good. Let the initial period be numbered zero. 
Those who are middle-aged in period 0 begin with all the fiat money 
and with stocks of storage that will mature in period 1. I assume that 
these stocks are large enough so that the middle-aged at t = 0 will not 
want to hold fiat money for consumption at t = 1. 

Given these conditions there exists a steady-state monetary equilib¬ 
rium in which each agent, regardless of period of birth, faces the 
same rate of return on fiat money (n), purchases the same level of real 
balances of fiat money (call it h) when young, and sells all his fiat 
money when middle-aged. The market-clearing condition in this case 
is p(t)M = N(t)h for t > 0. The rate of return of fiat money in this 
monetary equilibrium is therefore 

$$\frac{p(t+1)}{p(t)} = \frac{N(t+1)h/M}{N(t)h/M} = n.$$

Since $a^2$ is greater than the 2-period rate of return of fiat money, $n^2$, 
agents store to provide for all their consumption in old age. 
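
A worked example may help fix ideas. The Python sketch below assumes a logarithmic utility function, ln c1 + ln c2 + ln c3, which is only one specification satisfying the conditions stated in the text; the paper does not commit to it. Real balances h stand for p(t)m(t), fiat money returns n per period, and storage returns a squared over two periods. Under this assumed utility the young agent splits the endowment y equally among first-period consumption, real balances, and storage.

```python
import math


def steady_state(y: float, n: float, a: float) -> dict:
    """Closed-form choices under the assumed log utility."""
    h = y / 3.0            # real balances bought when young
    k = y / 3.0            # goods placed in storage when young
    return {"h": h, "k": k, "c1": y - h - k, "c2": n * h, "c3": a**2 * k}


def utility(y: float, n: float, a: float, h: float, k: float) -> float:
    """Lifetime utility ln c1 + ln c2 + ln c3 for given h and k."""
    return math.log(y - h - k) + math.log(n * h) + math.log(a**2 * k)


def money_price(t: int, h: float, pop0: float, n: float, money: float) -> float:
    """Price of money from market clearing p(t) M = N(t) h, with N(t) = pop0 * n**t."""
    return pop0 * n**t * h / money


# Hypothetical values: endowment 3, population growth 1.02, storage return a = 1.1.
y, n, a = 3.0, 1.02, 1.1
choice = steady_state(y, n, a)
print(choice)
print(money_price(5, choice["h"], 100.0, n, 50.0) / money_price(4, choice["h"], 100.0, n, 50.0))
print(a**2 > n**2)   # storage dominates money over two periods
```

The price ratio printed in the second line equals the population growth factor n, which is the one-period return on fiat money in the laissez-faire steady state.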

Environmental features that inhibit the exchange of private debt 
and thus arbitrage are essential for differences in rates of return. 
Nevertheless, equilibria with valued, high-velocity fiat money could 
still be observed in economies with private debt. Imagine that the 
return at t + 2 of goods stored at t is a function a[k(t)], where again 
k(t) is the number of goods stored by an individual at time t. Let a'(·) 
> 0, a'(0) = ∞, and a''(·) < 0. In such a model all agents would hold 
positive amounts of storage. Again, because fake storage cannot be 
distinguished from real storage, no agent will be able to sell storage 
that has not yet matured. Therefore, to secure consumption in the 
second period of life, one must lend units of the consumption good 
when young or borrow against the storage that will mature when the 
agent is old. Arbitrage will equate the marginal rates of return on 
storage and loans. If the 1-period rate of return in a nonmonetary 
equilibrium is less than n, a monetary equilibrium with a fixed supply 
of fiat money is possible. In such an equilibrium, storage would be 
held only from the first to the third period of life, while IOUs and fiat 
money (“inside” and “outside” money) will circulate more rapidly 
because they will be used to acquire consumption in the second period 
of life. An example of such an equilibrium is given in Freeman 
(1983). 

II. The Optimal Quantity of Money 

In Samuelson’s basic overlapping-generations model (1958) there ex¬ 
ist Pareto-optimal equilibria with a fixed supply of valued fiat money. 
However, in the neoclassical model of Samuelson (1968), Tobin 
(1968), and others, equilibria in which capital earns a rate of return 
greater than that of fiat money—equilibria not displayed by the basic 
overlapping-generations model—are nonoptimal. In such equilibria 
agents forgo some amount of the services of fiat money because of the 
opportunity cost of holding it. Such efforts to economize on money 
balances are unnecessary, it is argued, because real money balances 
can be produced at no cost in resources, for example, through a 
deflation of the money stock. Samuelson asserts that the holding of 
larger real money balances “would end up making everybody better 
off" (1968, p. 10). 

In the neoclassical models, real money balances are inserted as a 
direct argument in agents’ utility functions. Although it is generally 
not believed that the holding of assets called money increases utility 
except by allowing the agent to achieve a preferred consumption 
bundle, the practice is justified as a convenient shortcut to important 
results (Samuelson 1968, p. 8). Shortcuts, however, must be ques¬ 
tioned if models that are more specific about the economic environ¬ 
ment yield different results. 

In this section, I will establish examples of economies in which the 
use of the neoclassical shortcut is not justified in that deflation is not 
Pareto superior to laissez faire. My model is an appropriate source of 
these counterexamples because it duplicates the essential features of 
the world described by Samuelson, Tobin, and others. In my model, 
because a > n, fiat money in fixed supply has a lower rate of return 
than capital (storage) but is valued because it provides a service, the 
provision of second-period consumption, that cannot be provided by 
other assets because of their transactions costs. The utility derived 
from this service can be considered an analog of the neoclassical mar¬ 
ginal utility of money. In addition, as I will now show, laissez faire 
appears to be nonoptimal by the standards of a world without infor¬ 
mation constraints. 

A necessary condition for Pareto optimality in a world without 
information restrictions is the equality of the marginal rates of inter¬ 
temporal substitution of all agents living in the current period and the 
next. In my model if a > n, the rates of return of storage and fiat 
money differ and therefore so will marginal rates of substitution. 

Any increase in the rate of return of money to reduce the differ¬ 
ence in rates of return must be financed from tax revenues or the 
return on a government asset. However, the government cannot back 
its debt with storage without running into the very information constraints 
that prevent private agents from issuing notes backed by storage. 
If the government asked individuals to store on its behalf, the 
middle-aged and old would consume the goods entrusted to them 
and store fakes in their place. 

A feasible government policy is a deflation financed by lump-sum 
taxes in every period after period 0 of $T_1$ goods on each young agent, 
$T_2$ on each middle-aged agent, and $T_3$ on each old agent; that is, 

$$T_1 N(t) + T_2 N(t-1) + T_3 N(t-2) = p(t)[M(t-1) - M(t)]. \tag{1}$$

Note that if the age of each agent is private information, the only tax 
consistent with the information constraints is $T_1 = T_2 = T_3$. Since one 
might nevertheless envisage environments in which ages are known 
but there is still no private debt, I will show that my welfare results do 
not depend on the absence of age-specific taxes by allowing the government 
to impose such taxes. Let M(t) = zM(t - 1) with z constant. 

I want to distinguish four types of people affected by the
intervention: the current old, the current middle-aged, the current young,
and future generations. Note that the choice problem of the current
young differs from that of future generations in that the current
young are not taxed in their first period of life (at t = 0). Again, the
current middle-aged are assumed to start with all the fiat money and
sufficient stocks of storage so that they do not wish to carry fiat money
into their third period of life.

For $z \ge n/a$ and given these initial conditions, there exists a monetary
equilibrium (to be called the steady-state equilibrium) that enters a
steady state in period 1. In the steady-state equilibrium all agents born
after period 0 face the same rate of return on fiat money and
purchase the same amount of real balances of fiat money. In addition, all
agents born in or after period 0 sell all their fiat money when
middle-aged. The market-clearing condition in the steady state, p(t)M(t) =
N(t)h, gives us that the steady-state rate of return on fiat money is n/z.
This verifies that agents born after period 0 will not wish to hold fiat
money for consumption in old age for a > n/z. Agents born in period
0 will also sell all their fiat money when middle-aged if $p(2)/p(0) < a^2$. I
show in the Appendix that this inequality is satisfied for n/z < a.
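
The steady-state return n/z follows in one line from the market-clearing condition. The worked step below is a sketch that assumes, as the notation in the text suggests, that population grows by the factor n each period, N(t+1) = nN(t), and that the money stock follows M(t+1) = zM(t):

```latex
\frac{p(t+1)}{p(t)}
  = \frac{N(t+1)\,h / M(t+1)}{N(t)\,h / M(t)}
  = \frac{N(t+1)}{N(t)} \cdot \frac{M(t)}{M(t+1)}
  = \frac{n}{z}.
```

With z < 1 the money stock shrinks, so the price of money rises faster than population alone would imply.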

I will now establish four propositions about the welfare implications 
of such a deflationary policy. 

Proposition 1: The steady-state lifetime utility of a world without
information barriers cannot be achieved through deflation.

Proof: In a world without information barriers, storage can be
freely traded so that agents could get the 1-period rate of return a
from storage. An agent's lifetime budget constraint would be
$y \ge c_1(t) + [c_2(t)/a] + [c_3(t)/a^2]$. If the government owned storage with
which to finance the return on a government asset, this budget set would be
achievable. However, in the model with information barriers the
1-period rate of return a is available to agents not because they store
extra goods but because the government taxes agents to increase the
return of fiat money (unbacked by government storage). As a result,
the after-tax endowment of agents is reduced by the present value of
the taxes. To see this, recall the budget constraints of agents:
$c_1(t) = y - h(t) - k(t) - T_1$, $c_2(t) = [p(t+1)/p(t)]h(t) - T_2$, and
$c_3(t) = a^2 k(t) - T_3$. An agent's lifetime budget constraint in my model is
therefore

$$y - T_1 - \frac{T_2}{a} - \frac{T_3}{a^2} \ge c_1(t) + \frac{c_2(t)}{a} + \frac{c_3(t)}{a^2}.$$

Although deflation of the money supply can give agents the rate of
return they could get in the unrestricted world, it cannot give them
the same level of utility because of the taxes required for deflation.
Q.E.D.

The effect of deflation on steady-state lifetime utility is explored in
propositions 2, 3, and 4. They establish that steady-state lifetime
utility depends crucially on the timing of the tax and the resulting level of
storage.

Steady-state lifetime utility can be expressed as the following
continuously differentiable function of z when $z \ge n/a$:

$$W(z) = U\left[y - h(z) - k(z) - T_1(z),\ \frac{n}{z}h(z) - T_2(z),\ a^2 k(z) - T_3(z)\right], \tag{2}$$

where $T_1(z)$, $T_2(z)$, and $T_3(z)$ satisfy the government budget constraint
(1) and where h(z) and k(z) are the reaction functions of agents to the
specified government tax policy and the given level of z. The reaction
functions satisfy the first-order conditions:

$$-U_1(y - h - k - T_1) + \frac{n}{z}U_2\left(\frac{n}{z}h - T_2\right) = 0 \tag{3}$$

$$-U_1(y - h - k - T_1) + a^2 U_3(a^2 k - T_3) = 0. \tag{4}$$


Proposition 2: Steady-state lifetime utility is greater under laissez
faire than under any change in the money supply financed by lump-sum
taxes or transfers imposed on the middle-aged.

Proof: Proposition 2 can be proven by showing that W(z) reaches a
maximum when z = 1. For $T_1 = T_3 = 0$,

$$T_2 = \frac{p(t)[M(t-1) - M(t)]}{N(t-1)} = \left(\frac{1}{z} - 1\right)\frac{p(t)M(t)}{N(t-1)} = \left(\frac{1}{z} - 1\right)nh(z).$$

Then $c_2 = (n/z)h(z) - T_2 = nh(z)$ so that equation (2) can be simplified
to $W(z) = U[y - h(z) - k(z),\ nh(z),\ a^2 k(z)]$.

$$W'(z) = -U_1\left(\frac{dh}{dz} + \frac{dk}{dz}\right) + U_2\,n\frac{dh}{dz} + U_3\,a^2\frac{dk}{dz}$$

$$= \frac{dh}{dz}(-U_1 + nU_2) + \frac{dk}{dz}(-U_1 + a^2 U_3) = \frac{dh}{dz}\,nU_2\left(1 - \frac{1}{z}\right)$$

from the first-order conditions. Therefore z = 1 is a unique global
maximum if dh/dz < 0, which can be verified by applying the implicit
function theorem to the first-order equations (3) and (4) (see Freeman
1983, app. B). Q.E.D.
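
A numerical illustration of proposition 2 can be sketched under one extra assumption that is not part of the proof: logarithmic utility, as in the Appendix example. With taxes only on the middle-aged, conditions (3) and (4) then give h = g2*c1/(z*g1) and k = g3*c1/g1, and the government budget constraint implies that middle-age consumption is n*h. The function name and the parameter values are illustrative choices, not taken from the paper.

```python
import math

def welfare_middle_aged_tax(z, y=1.0, n=1.0, a=1.1, gammas=(1.0, 1.0, 1.0)):
    """Steady-state utility W(z) when only the middle-aged are taxed.

    Assumes log utility (an illustrative choice). The agents' first-order
    conditions then give h = g2*c1/(z*g1) and k = g3*c1/g1, and the
    government budget constraint implies c2 = n*h.
    """
    g1, g2, g3 = gammas
    c1 = g1 * y / (g1 + g2 / z + g3)   # solves c1 = y - h - k
    h = g2 * c1 / (z * g1)             # real balances of fiat money
    k = g3 * c1 / g1                   # goods placed in storage
    c2 = n * h                         # middle-age consumption after the tax
    c3 = a ** 2 * k                    # old-age consumption from storage
    return g1 * math.log(c1) + g2 * math.log(c2) + g3 * math.log(c3)

if __name__ == "__main__":
    # Scan money-growth factors z above n/a and report where W(z) peaks.
    grid = [0.92 + 0.01 * i for i in range(60)]
    best = max(grid, key=welfare_middle_aged_tax)
    print(f"W(z) is largest on the grid at z = {best:.2f}")
```

Under these assumptions the scan peaks at the laissez faire value of z, in line with the proposition; other utility functions would need the first-order conditions solved numerically.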

Similar propositions can be established for any combination of $T_1$
and $T_2$ such that $T_3 = 0$.

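Proposition 3: Some degree of deflation will increase steady-state
lifetime utility when the necessary taxes are levied entirely on the old.

Proof: The proposition is established if it is shown that W'(z) < 0
when z = 1. When $T_1 = T_2 = 0$,

$$T_3 = \left(\frac{1}{z} - 1\right)\frac{p(t)M(t)}{N(t-2)} = \left(\frac{1}{z} - 1\right)n^2 h(z).$$

Then

$$W'(z) = -U_1\left(\frac{dh}{dz} + \frac{dk}{dz}\right) + U_2\left[\frac{n}{z}\frac{dh}{dz} - \frac{n}{z^2}h\right] + U_3\left[a^2\frac{dk}{dz} - \left(\frac{1}{z} - 1\right)n^2\frac{dh}{dz} + \frac{n^2}{z^2}h\right]$$

$$= -U_3\frac{dh}{dz}\left(\frac{1}{z} - 1\right)n^2 + \frac{h}{z^2}\left(n^2 U_3 - nU_2\right) \tag{5}$$

from (3) and (4).

Again from (3) and (4), we have that at z = 1, $U_2 = U_1/n = a^2 U_3/n$. Therefore
$W'(1) = h(1)n[-U_3(a^2/n) + nU_3] + 0 < 0$. Q.E.D.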

Note that the first term of (5) is positive for z < 1 if dh/dz < 0.
Therefore it may not be true that W'(z) is always negative. It may well
be that W'(n/a) > 0, which would imply that rate-of-return equality
(n/z = a) is not the rate of deflation that maximizes steady-state
lifetime utility.
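
The sign of W'(z) at z = 1 when the old bear the tax can be checked numerically in a small sketch. It assumes logarithmic utility (an illustrative choice, as in the Appendix example) and a tax per old agent of (1/z - 1)*n^2*h, which is what the government budget constraint (1) implies when only the old are taxed. The function name and parameter values are hypothetical.

```python
import math

def welfare_old_tax(z, y=1.0, n=1.0, a=1.1, gammas=(1.0, 1.0, 1.0)):
    """Steady-state utility W(z) when only the old are taxed.

    Assumes log utility. Agents take the tax T3 as given, so their
    first-order conditions give h = g2*c1/g1 and k = g3*c1/g1 + T3/a**2.
    """
    g1, g2, g3 = gammas
    tax_per_unit_h = (1.0 / z - 1.0) * n ** 2          # T3 / h from the budget constraint
    c1 = g1 * y / (g1 + g2 + g3 + g2 * tax_per_unit_h / a ** 2)
    h = g2 * c1 / g1                                   # real balances of fiat money
    tax = tax_per_unit_h * h                           # T3, paid when old
    k = g3 * c1 / g1 + tax / a ** 2                    # storage, raised to cover the tax
    c2 = (n / z) * h                                   # money earns n/z for one period
    c3 = a ** 2 * k - tax                              # storage return net of the tax
    return g1 * math.log(c1) + g2 * math.log(c2) + g3 * math.log(c3)

if __name__ == "__main__":
    eps = 1e-4
    slope = (welfare_old_tax(1 + eps) - welfare_old_tax(1 - eps)) / (2 * eps)
    print(f"central-difference estimate of W'(1): {slope:.4f}")
```

A negative slope at z = 1 means that a small reduction in z (a mild deflation) raises steady-state utility in this parameterization; the sketch says nothing about large deflations.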

Propositions 2 and 3 underline the importance of the timing of the
tax that finances deflation. Deflation financed by taxes on the young
and middle-aged lowers steady-state lifetime utility, which is also an
implication of simpler overlapping-generations models. This result is
overturned when the taxes are imposed on the old. The difference
lies in the ability of agents to increase productive (a > n) storage to
pay some portion of the tax when old. For this reason, when the tax is
imposed equally on every agent regardless of age ($T_1 = T_2 = T_3$), the
result of proposition 3 is again obtained (Freeman 1983).

Although some degree of deflation may increase steady-state
lifetime utility when the old are taxed, this does not imply that this
deflation is Pareto superior to laissez faire. The middle-aged at t = 0
are affected both by the tax levied on the old and by the change in the
real value of their money holdings caused by the deflation. The net
welfare effect of this deflation is ambiguous. Freeman (1983) gives
examples of economies in which some degree of deflation is Pareto
superior to laissez faire and in which no degree of deflation is Pareto
superior to laissez faire.

Note also that even if the deflation is entirely financed by a tax on
the old, a sufficiently large deflation may actually lower steady-state
lifetime utility as the burden of taxation, represented by
$-U_3(dk/dz) \times [(1/z) - 1]$ in equation (5), begins to outweigh the benefits
of the increased rate of return of fiat money. As an example, consider the
neoclassical suggestion that deflation be used to achieve
rate-of-return equality.

Proposition 4: Steady-state lifetime utility may be lower under 
rate-of-return equality financed by a tax on the old than under laissez 
faire. 

A proof by example of proposition 4 may be found in the Appendix.
The result of proposition 4 is also obtained when the government
uses taxes that are not age specific (Freeman 1983).

III. Concluding Remarks 

The model of this paper makes two contributions. First, it displays
and explains important monetary phenomena not displayed by other
overlapping-generations models. A central belief in monetary theory
is that money is held by agents because it is more useful than other
assets in conducting transactions, that is, because transactions
involving money are less costly than those involving other assets. Existing
models of overlapping generations do not capture this idea except
perhaps through the broadest analogy. In these models fiat money is
valued, in the absence of legal restrictions, only if it has a rate of
return no less than that of any other asset. Thus, these models cannot
generate equilibria in which agents hold fiat money and a productive
asset with a greater rate of return, nor ones in which fiat money is
exchanged more often than other assets. As a result, Cass and Shell
(1980) and Kareken and Wallace (1980) have argued that models
displaying these phenomena in explicit models are important targets
for research. My model is a first answer to this call.

Second, the model generates counterexamples to neoclassical
beliefs concerning the optimal quantity of money. Its completely explicit
environment displays all the essential features of the world discussed 
in the neoclassical literature but shows nevertheless that deflation may 
not be Pareto superior to laissez faire and that rate-of-return equality 
may result in lower steady-state lifetime utility. The model’s results 
indicate that the strong information barriers necessary for rate-of- 
return dominance may have important implications for the ability of 
the government to achieve preferred allocations. The limitations—or 
absence of limitations—in the government’s capabilities in this and 
other worlds with transaction frictions can only be uncovered by work 
that completely specifies the physical environment. 


Appendix 

Proposition: If there exists a stationary equilibrium for n/z < a, then
$p(2)/p(0) < a^2$.

Proof: Denote the real balances of fiat money purchased by each agent
born after period 0 as $h$ and the real balances purchased by those born in
period 0 as $\hat{h}$. Market clearing requires that $p(2)M(2) = N(2)h$ and
$p(0)M(0) = N(0)\hat{h}$. Then

$$\frac{p(2)}{p(0)} = \frac{[N(2)/M(2)]\,h}{[N(0)/M(0)]\,\hat{h}} = \frac{h}{\hat{h}}\left(\frac{n}{z}\right)^2,$$

so that $p(2)/p(0) < (n/z)^2$ if $\hat{h} > h$. Recall that the problems of the current
young and future generations differ only in that the current young are not
taxed in the first period of life. Therefore, to show that $\hat{h} > h$, I need to show
that, for a given price of money in the next period and for given z and T, the
demand for real balances is greater for a larger first-period endowment (i.e.,
a lower tax in the first period of life).

The problem of agents can be expressed as

$$\max_{m,k}\ U[Y_1 - p(t)m - k,\ p(t+1)m - T_2,\ a^2 k - T_3],$$

where $Y_1$ is the first-period endowment after tax.

The first-order conditions and market clearing, m = M(t)/N(t), imply

$$-p(t)\,U_1\left[Y_1 - p(t)\frac{M(t)}{N(t)} - k\right] + p(t+1)\,U_2\left[p(t+1)\frac{M(t)}{N(t)} - T_2\right] = 0$$

$$-U_1\left[Y_1 - p(t)\frac{M(t)}{N(t)} - k\right] + a^2 U_3(a^2 k - T_3) = 0.$$

By using the implicit function theorem, one can find that

$$\frac{dp(t)}{dY_1} = \frac{a^4 p(t)\,U_{11}U_{33}}{-U_1 U_{11} - U_1 a^4 U_{33} + a^4 p(t)[M(t)/N(t)]\,U_{11}U_{33}} > 0.$$

Therefore the demand for real balances when not taxed in the first period of
life, $\hat{h}$, is greater than the demand when so taxed, $h$. Then $p(2)/p(0) < a^2$ for
n/z < a. Q.E.D.

Proof of proposition 4: I will prove by example that rate-of-return
equality may result in lower steady-state utility than laissez faire even if the
deflation is funded entirely by a tax on old agents.

Let agents have the utility function $\gamma_1 \ln(c_1) + \gamma_2 \ln(c_2) + \gamma_3 \ln(c_3)$. The
resulting first-order conditions with a deflation of z = n/a paid for by a tax on
the old can be written to form the following solution algorithm of the steady
state:

$$k = \frac{\{\gamma_3 + \gamma_2[(1/an) - (1/a^2)]\}\,y}{\gamma_1 + \gamma_2 + \gamma_3 + \gamma_2[(1/an) - (1/a^2)]}, \qquad h = \frac{(y - k)\gamma_2}{\gamma_1 + \gamma_2}.$$

Under laissez faire, $k = \gamma_3 y/(\gamma_1 + \gamma_2 + \gamma_3)$, $h = \gamma_2 y/(\gamma_1 + \gamma_2 + \gamma_3)$, and T = 0.
When these are evaluated and substituted into the utility function, we find
that laissez faire yields a greater utility when the value of $\gamma_2$ is low relative to
$\gamma_3$, for example, when y = 1; a = 1.1; n = 1; $\gamma_1$ = 1; $\gamma_2$ = .000001; and
$\gamma_3$ = 1,000,000. Q.E.D.
The Economics of Copying


William R. Johnson 

University of Virginia


Recent technological advances that enable consumers to copy creative
works without compensating their owners have led to proposals
for restrictions on copying. To analyze such restrictions, this paper
considers two models of copying. The first model emphasizes the
household production aspect of copying with costs differing across
consumers, while the second relies on the fixed cost of copying
technologies. In both models, a case can be made for restricting copying
even in the short run if copying induces a large reduction in demand
for originals relative to its effect on total consumption. The long-run
case for restriction hinges additionally on the elasticity of supply and
the value consumers place on product variety.


I. Introduction 

Recent technological changes have made it practical for consumers to
obtain high-quality copies of original creative works at low cost; these
techniques include the dry paper copier and sophisticated audio and
video tape recorders. The use of these copying technologies has
spawned a number of lawsuits by the owners of the original works
being copied 1 and has led to proposals for a tax on or even prohibition
of copying. 2 Indeed, such a fee is now imposed, at least nominally,
on copiers of many scholarly journal articles. This paper examines
the efficiency and distributional aspects of copying for personal
use in the context of two alternative models of copying as well as the
possibility of welfare improvements through public policy (restrictions
on copying or subsidies to the producers of creative works).

I appreciate helpful comments from Edgar Browning, Maxim Engers, Scott Masten,
Ron Michener, David Mills, Roger Sherman, George Stigler, and an anonymous
referee on earlier drafts. Support from the Center for Advanced Studies at the
University of Virginia is gratefully acknowledged.

1 Sony Corp. of America v. Universal City Studios, 659 F.2d 963 (9th Cir. 1981);
Williams and Wilkins Co. v. U.S., 487 F.2d 1372 (1973).

2 For example, Senator Mathias in the 97th Congress proposed a copyright royalty
for home video recorders and blank tapes (Amendment 1333 to S. 1758) (U.S.
Senate Committee on the Judiciary 1982, p. 335).

Although the analysis focuses on copying, the problem is a more 
general second-best question that has other interesting applications. 
The essence of the problem is that consumers have access to close 
substitutes for the output of a decreasing average cost firm that prices 
above marginal cost. For example, if consumers can generate their 
own electricity as an alternative to commercial power priced above 
marginal cost, should they be allowed to sell excess power back into 
the system? Another application arises with toll-free alternatives to a 
toll highway or bridge that prices above marginal cost. 

Two basic models of copying are used to analyze policy alternatives.
The first model emphasizes copying as household production in 
which the cost of copying differs across individuals. In the second 
model, copiers must incur a large fixed cost (e.g., purchase of a video 
recorder) in order to copy. With the additional assumptions discussed 
below, both models explain why some consumers copy while others 
do not, and both imply that unrestricted copying might be socially 
inefficient in the short run. 


Literature 

The economic literature most relevant to the problem of copying is
the theory of optimal patents. The optimal patent balances the
distortion inherent in monopoly use of the patented process (price higher
than marginal costs of production) against the incentive to innovate
offered by property rights in information. In these models, patents
are optimal only in the long run when the supply of innovations can
vary (see Nordhaus 1969; Scherer 1972; Tandon 1982). Like the
patent models, economic analyses of copyright have also focused on
the short-run welfare gain from imitators compared with the long-run
discouragement of creative activity but have not constructed an
explicit model of copying (see Breyer 1970; Braunstein, Fischer,
Ordover, and Baumol 1979).

Various other contributions to the theory of the second best are also
related to this analysis. For example, Bhagwati and Hansen's (1973)
analysis of smuggling in international trade shows that smuggling to
evade tariff barriers can improve the allocation of resources even if
smuggling uses more resources than legal trade. Here smuggling is
analogous to copying in the household production model. Bhagwati
has generalized this idea in his taxonomy of "directly unproductive
activities." Of course, the possible efficiency gain when one
distortion countervails another is a familiar proposition from the
theory of second best.

Ordover and Willig's (1978) model of the use of academic journals
(personal subscription vs. library use) bears a close relationship to
copying, since library use, like copying, is a way of consuming a
creative work without compensating its owner. However, their model
allows journal publishers to discriminate in price between libraries
and personal subscribers, thereby capturing in revenue some of the
surplus generated by library use. Ordover and Willig's result that
library user fees would be optimal is clearly analogous to an optimal
copying tax but depends critically on journal prices being Ramsey
optimal to begin with. In the model presented here, price
discrimination is not possible and prices are profit maximizing, not Ramsey
optimal. Both assumptions are more realistic for the broad spectrum
of creative works that can be copied (books, records, etc.).

A model of copying must incorporate both the long-run supply
aspects of the patent models and the greater social marginal cost of
copying, as in smuggling or tax evasion models, yet without the
limiting assumptions of the Ordover-Willig analysis. A recent paper by
Novos and Waldman (1984) does treat both the long-run supply
problem and the higher social marginal cost of copying and raises the
possibility that social welfare could rise with increased copyright
protection even in the short run. However, the Novos-Waldman model
does not consider consumers with different tastes or the product
variety issues that are important in markets prone to copying. This
paper, on the other hand, explicitly models consumer demand for
variety in a multifirm market, treats two distinct copying technologies,
distinguishes short-run and long-run effects, and analyzes a tax on
copying.


Assumptions and Results Summarized

To clarify nomenclature, let "original" refer to authorized versions of
the creative work such as books, records, films, TV shows, and so on.
The sale of originals at a uniform price is controlled by the
profit-maximizing owner of the master work (the firm). "Copies" then refer
to unauthorized versions of the creative work. At first, copies are
assumed to be equivalent in consumption to originals. The indirect
effect of copy demand on original demand is also neglected for
simplicity; although one clearly needs an original in order to copy, the
extra revenue generated by such demand is likely to be small unless
price discrimination is possible.

Creative works are supplied in the long run along an upward-sloping
supply curve, while originals are produced at zero marginal
cost. The zero-marginal-cost assumption merely facilitates the
analysis; what is crucial is that the marginal cost of originals does not
exceed the marginal cost of copies, which is reasonable since original
producers would use the copy technology if it were cheaper. There
are many sellers of originals, each with the monopoly power that
stems from the fact that his work is not a perfect substitute for the
others. Finally, individual preferences vary but market demand
curves for each work are identical. The two models differ in their
assumptions about copying costs. In the household production model,
fixed costs of copying are zero and marginal costs vary across
consumers. In the fixed cost model, copies can be made at zero
marginal cost once a fixed cost (constant across consumers) is incurred.

The effect of unlimited copying in both models is to redistribute
income away from the owners of creative works, although the effect
on the price of originals (in the household production model) and on
social welfare in both models is ambiguous. The ambiguity about
welfare effects is clear in the long run, where the greater consumption
of works through copying is balanced by the reduced incentive to
produce new works. However, even in the short run, an efficiency
case for a tax on copies can be made because copies use more
resources (on the margin) than do originals. As will be made clear, this
follows because the private cost of a copy and an original must be
equal for the marginal copier, but the social cost of an original is less
than its private cost. The welfare effects of copying depend on the
circumstances; no unambiguous result applies universally.

The paper is organized as follows. Section II presents the basic 
model without copying. Sections III and IV examine the household 
production and fixed cost views of copying, respectively. Section V 
allows copies to be imperfect substitutes for originals, while Section 
VI considers some policy alternatives. 

II. The Basic Model without Copying 

Before considering copying, we present a simple model that captures
the essence of the market for originals. Following earlier models of
product differentiation like Salop (1979), the diversity of creative
works is represented by location around a circle; each potential
producer can locate where he wishes, choosing from an infinite number of
locations. For simplicity, location, x, is defined as the fraction of the
circumference of the unit circle that x lies from the point (1, 0). Thus
$x \in [0, 1)$; x = .5 is halfway around the circle, and so forth. The circular
assumption means that $\lim_{x \to 1} x = 0$.

3 In that respect, the analysis resembles that of Baumol and Ordover (1977), who
look at optimal public goods pricing when exclusion is possible.

A continuum of consumers is spread uniformly around the circle,
and population size is not varied in the analysis. A consumer's
location represents his tastes, and we assume that a consumer is more
likely to consume creative works that are closer to his tastes.
Specifically, consumer i consumes work j if and only if

$$\mu_i \ge p_j + \psi_{ij}, \tag{1}$$

where $\mu_i$ is an individual-specific parameter representing willingness
to pay, $p_j$ is the price of work j, and $\psi_{ij}$ denotes the distance between
the consumer and the work. Thus consumers purchase all works
whose price plus distance is less than willingness to pay, implying that
the numbers or locations of other firms do not directly affect the
demand for or surplus generated by any particular work, an
enormously simplifying aspect of the model.

To illustrate, figure 1 shows consumer i located at x = 1/4 and firm j
at x = 1/2. Consumer i will purchase from j if and only if j charges a
$p_j \le \mu_i - 1/4$. If all firms price at p, then consumer i will purchase from
any and all firms closer than $\mu_i - p$ (as shown by the arrows inside the
circle).
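
The purchase rule (1) is easy to state in code. The sketch below assumes that "distance" means the shorter arc between two locations on a circle of circumference 1, which is consistent with distances running from 0 to 1/2; the function names are hypothetical.

```python
def circular_distance(x_i, x_j):
    """Shorter arc between two locations on a circle of circumference 1."""
    d = abs(x_i - x_j) % 1.0
    return min(d, 1.0 - d)

def consumes(mu_i, p_j, x_i, x_j):
    """Rule (1): consumer i consumes work j iff mu_i >= p_j + distance."""
    return mu_i >= p_j + circular_distance(x_i, x_j)

if __name__ == "__main__":
    # A consumer at 0.25 facing a work at 0.5 priced at 0.1 is 0.25 away.
    print(consumes(0.40, 0.10, 0.25, 0.50))
    print(consumes(0.30, 0.10, 0.25, 0.50))
```

Because the rule for work j involves only that work's price and distance, the decision can be evaluated work by work, which is the simplification the text emphasizes.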

If the densities of $\mu$ at any location are identical and denoted by
$g(\mu)$, then the demand for any particular work is given by

$$2\int_0^{1/2}[1 - G(p + \psi)]\,d\psi, \tag{2}$$

since $G(p + \psi)$ denotes the fraction of consumers at a distance $\psi$ for
whom $\mu < p + \psi$. The firm's profit-maximizing price is derived by
differentiating (2) and equating to marginal cost, zero.

Fig. 1. Locations of firms and consumers

In the long run, the number of works, N, is determined by an
upward-sloping supply curve $N(\Pi)$, where $\Pi$ is the profit each owner
will make selling originals. Under the assumed demand structure, $\Pi$
does not depend directly on N or on the location of any works, so
strategic interactions among firms can be ignored. Thus, works are
far from being perfect substitutes for each other; implicitly,
consumers have a high demand for product variety. 4

To get some specific results from this model, consider a specific
example that will be referred to throughout the paper. Let $\mu$ be
distributed uniformly over the interval [0, a], where $a \le 1/2$. Using (2),
it is easy to compute that the demand for originals produced at any
one location is $(a - p)^2/a$, leading the firm to choose a price, a/3, that
yields revenues $4a^2/27$.

Consider the consumer surplus generated by a firm selling at price
p. A consumer at a distance $\psi$ with willingness to pay $\mu$ receives
surplus $\mu - \psi - p$ if he buys (i.e., if $\mu > \psi + p$). Taking all possible
distances into account ($0 \le \psi \le 1/2$) and the density of $\mu$ at each
distance, $g(\mu)$, we have consumers' surplus given by

$$2\int_0^{1/2}\int_{p+\psi}^{\infty}(\mu - p - \psi)\,g(\mu)\,d\mu\,d\psi. \tag{3}$$

In our specific example, (3) becomes $8a^2/81$.
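
The numbers in the uniform example can be reproduced with a few lines of code. The sketch below evaluates (2) by a midpoint rule, compares it with the closed form $(a - p)^2/a$, and searches a price grid for the revenue-maximizing price. The closed form used for (3), $(a - p)^3/(3a)$, is worked out here for the uniform case rather than quoted from the text, and all function names are hypothetical.

```python
def G(x, a):
    """CDF of willingness to pay when mu is uniform on [0, a]."""
    return min(max(x / a, 0.0), 1.0)

def demand_numeric(p, a, cells=2000):
    """Midpoint-rule evaluation of (2): 2 * integral of 1 - G(p + psi) over [0, 1/2]."""
    d_psi = 0.5 / cells
    total = 0.0
    for i in range(cells):
        psi = (i + 0.5) * d_psi
        total += (1.0 - G(p + psi, a)) * d_psi
    return 2.0 * total

def demand(p, a):
    """Closed form of (2) in the uniform example (requires a <= 1/2)."""
    return (a - p) ** 2 / a if p < a else 0.0

def consumer_surplus(p, a):
    """Closed form of (3) in the uniform example, derived for this sketch."""
    return (a - p) ** 3 / (3.0 * a) if p < a else 0.0

def profit_maximizing_price(a, steps=3000):
    """Grid search over prices in [0, a]; marginal cost is zero."""
    grid = [a * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: p * demand(p, a))

if __name__ == "__main__":
    a = 0.3
    p_star = profit_maximizing_price(a)
    print("price / a          :", p_star / a)
    print("revenue / a^2      :", p_star * demand(p_star, a) / a ** 2)
    print("cons. surplus / a^2:", consumer_surplus(p_star, a) / a ** 2)
```

The grid search is deliberately crude; with zero marginal cost the first-order condition $(a - p)^2 - 2p(a - p) = 0$ gives the same price directly.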

The welfare aspects of equilibrium in this market are as follows.
First, given N firms, price should equal marginal cost, zero, to
maximize social surplus; thus, some consumers are inefficiently excluded
from consumption. Second, since a firm's revenues are less than its
contribution to surplus, too few firms enter the market (i.e., too few
creative works are produced). This last result arises from imperfect
substitutability; in Demsetz's (1970) model, TV shows are perfect
substitutes for each other and hence are not undersupplied, because the
marginal supplier creates no more surplus than he receives in
revenue. In our model, the value consumers place on product variety
means that the marginal firm cannot capture in revenue what it
creates in surplus.


4 This assumption is made for analytic convenience only. The models of Dixit and
Stiglitz (1977) and Salop (1979) depict the more realistic case of imperfect
substitutability among works but would be impossibly complex to use to analyze copying.
Demsetz's (1970) model of the "private production of public goods" assumes perfect
substitutability and is therefore inappropriate for this paper.

III. The Household Production Model of Copying

The Short Run 

A sensible model of copying must be consistent with the fact that
some consumers pay for the original while others make copies.
Individuals are assumed to differ with respect to their costs of making a
copy; in particular, suppose that time costs are most important in
copying. 5 Then, choosing units of time so that one copy can be made
in one time unit, the cost of an unauthorized copy for any person is w,
his wage rate per time unit. We suppress other costs for simplicity and
assume that copies are equivalent in consumers' eyes to originals. The
line w = p divides the population into two groups; those for whom w
< p will prefer to copy while high-wage consumers (w > p) will buy
originals, if they consume the work at all. The population of
consumers at any distance from the firm varies along two dimensions, $\mu$ and
w, and can be divided into three groups: those who buy originals (w >
p and $\mu > \psi + p$), those who copy (w < p and $\mu > \psi + w$), and those
who do neither ($\mu < \min(w + \psi,\ p + \psi)$). Figure 2 illustrates the
possibilities for particular values of p and $\psi$. Areas A + B correspond
to buyers of originals, C + D to copiers, and E to nonconsumers.
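
The three-way split of consumers can be written as a short classifier. This is a minimal sketch: the function name and labels are hypothetical, and ties (w exactly equal to p, or willingness to pay exactly at the threshold) are resolved arbitrarily because they have zero measure under a continuous density.

```python
def classify(mu, w, p, psi):
    """Classify a consumer at distance psi from a work priced at p.

    mu is willingness to pay and w is the consumer's cost of making a copy.
    Returns "original", "copy", or "none".
    """
    cheapest = min(p, w)             # cost of the cheaper way to obtain the work
    if mu < psi + cheapest:
        return "none"                # neither option is worth its cost
    return "original" if w > p else "copy"

if __name__ == "__main__":
    p, psi = 0.2, 0.1
    for mu, w in [(0.5, 0.4), (0.5, 0.1), (0.15, 0.1)]:
        print(mu, w, classify(mu, w, p, psi))
```

The comparison with the cheaper of p and w is what makes the nonconsumer condition read as willingness to pay below the minimum of the two full costs.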

Revenues for the producers of originals change with p along two
margins: higher p induces some buyers to copy and some buyers not
to consume. Letting $g(\mu, w)$ be the joint density of $\mu$ and w, revenue
for originals is given by

$$2p\int_0^{1/2}\int_p^{\infty}\int_{p+\psi}^{\infty} g(\mu, w)\,d\mu\,dw\,d\psi, \tag{4}$$

while demand for copies is

$$2\int_0^{1/2}\int_0^{p}\int_{w+\psi}^{\infty} g(\mu, w)\,d\mu\,dw\,d\psi. \tag{5}$$

Price, p, is set by the firm to maximize (4).

Although it would seem intuitive that copying reduces the price
charged for originals, it turns out that no such proposition can be
proved for the general form of $g(\mu, w)$. At any price, copying reduces
the demand for originals but may increase the slope of the demand
curve enough to reduce elasticity and marginal revenue. The
monopolist's price could rise, but, of course, total revenue must be reduced
by copying.

To derive more concrete results, take $g(\mu, w)$ to be uniform over
$0 \le \mu \le a$, $0 \le w \le a$, where $a \le 1/2$. Using (4), it is easy to show that the
demand for originals is $(a - p)^3/a^2$ and that the monopolist's price is
a/4. Recalling that price was a/3 in this example without copying, we
see that copying reduces the price, consumption, and revenue of
originals. Of course, because of copying more consumers than before
enjoy the work and consumers' surplus will rise. The components of
this additional consumers' surplus can be seen in figure 2, where p'
denotes the price of originals with no copying. The advent of copying
reduces the price for those who continue to buy originals (area A) and
allows some new consumers to buy originals (area B). Copiers,
including former original buyers (area C) and new consumers (area D), are
also better off. Those who do not consume (area E) are no worse off.

5 One could invoke other costs of copying that vary across individuals (access to
originals to copy, feelings of guilt about copying) without materially affecting the
model.

Fig. 2. Distribution of consumers by $\mu$ and w

In our specific example, consumers' surplus rises from $.0987a^2$ to
$.1621a^2$ when copying is allowed (see App. for details). Because
revenue falls, social surplus rises only 8 percent while consumers' surplus
increases 64 percent.
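
The short-run comparison can be approximated with closed forms for the uniform example. The revenue expressions follow the demand functions given in the text; the surplus expressions are worked out for this sketch by integrating each group's surplus over the uniform density, so they are a reconstruction rather than the Appendix's own formulas, and the function names are hypothetical.

```python
def revenue_no_copy(p, a):
    return p * (a - p) ** 2 / a            # price times demand (a - p)^2 / a

def revenue_copy(p, a):
    return p * (a - p) ** 3 / a ** 2       # price times demand (a - p)^3 / a^2

def argmax_price(revenue, a, steps=1200):
    """Grid search over prices in [0, a]; 1200 steps puts a/3 and a/4 on the grid."""
    grid = [a * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: revenue(p, a))

def surplus_no_copy(p, a):
    return (a - p) ** 3 / (3 * a)

def surplus_copy(p, a):
    buyers = (a - p) ** 4 / (3 * a ** 2)                # w > p, surplus mu - psi - p
    copiers = (a ** 4 - (a - p) ** 4) / (12 * a ** 2)   # w < p, surplus mu - psi - w
    return buyers + copiers

def compare(a=0.5):
    p0 = argmax_price(revenue_no_copy, a)
    p1 = argmax_price(revenue_copy, a)
    r0, r1 = revenue_no_copy(p0, a), revenue_copy(p1, a)
    cs0, cs1 = surplus_no_copy(p0, a), surplus_copy(p1, a)
    return {
        "price_no_copy": p0, "price_copy": p1,
        "revenue_no_copy": r0, "revenue_copy": r1,
        "cs_gain": cs1 / cs0 - 1.0,                      # proportional rise in consumers' surplus
        "ss_gain": (cs1 + r1) / (cs0 + r0) - 1.0,        # proportional rise in social surplus
    }

if __name__ == "__main__":
    for key, value in compare().items():
        print(f"{key}: {value:.4f}")
```

The proportional gains come out close to the percentages quoted in the text; small differences in the third decimal of consumers' surplus may reflect rounding in the original.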

Copying allows some additional consumers to enjoy the work, each
contributing a positive amount to social surplus. However, some
former purchasers now copy, reducing their contribution to social
surplus from $\mu - \psi$ to $\mu - \psi - w$ for a net loss of w. Social surplus is
enhanced by copying only if the demand-enhancing effect outweighs
the demand-switching effect, or in terms of figure 2, if the population
is not overly concentrated in area C.


The Long Run 

Thus far we have examined the welfare effect of copying, holding the
number of firms (works) constant. In the long run, the reduction in
revenues induced by copying will diminish the supply of creative
works along the supply curve, $N(\Pi)$. In this model, both revenues and
consumers' surplus per work are independent of the number of
works, so total social surplus is simply the product of the number of
works and the social surplus generated by each work. Copying could
reduce total social surplus even when social surplus per work rose if
the reduction in N were large enough. A large elasticity of supply
makes this outcome more likely. Moreover, total consumers' surplus
might fall in the long run, even though consumers' surplus per work
must rise with copying. Adverse long-run consequences are again
more likely the greater the elasticity of supply. In our example, social
surplus falls with copying if the supply elasticity exceeds 0.23, while
consumers' surplus falls if this elasticity exceeds 1.45 (see App.).

It is clear that these long-run conclusions are a consequence of the
imperfect substitutability of creative works in consumption. If all
works were perfect substitutes, then there would be no adverse welfare
effects stemming from the decline in N, and copying would necessarily
make consumers better off even in the long run. In general,
then, the greater the value consumers place on product variety, the
more likely are harmful effects of the reduction of N in the long run.

To summarize the effects of copying in the household production
model, copying must enhance consumers’ surplus per work and reduce
firm revenues but may also reduce social surplus per work if
copiers are “mostly” former buyers of originals rather than former
nonconsumers. In the long run, consumers’ surplus can fall as revenue
losses reduce the number of works. This is clearly more likely the
greater are both the supply elasticity and the value consumers place
on product variety.


IV. The Fixed Cost Model 

The Short Run 

An alternative view of copying emphasizes the investment in costly 
equipment necessary for copying. To simplify, the marginal cost of 
copying is now taken to be zero. In order to explain variation in 
copying across the population, differences in tastes must be invoked; 
consumers with low demands will not find it economical to pay the 
fixed cost of copying technology. An interesting feature of this model 
is that demand for any particular work is affected indirectly by the 
prices of other works since they affect a consumer’s decision to invest 
in the copying technology. 

Let us begin with the consumer’s decision to copy. Purchase of copy
technology at fixed cost F allows a consumer to copy at no further
cost.⁶ A copying consumer will, therefore, copy works up to a distance
of μ away from him. If N firms are evenly spaced around the circle,⁷ a
copier will consume 2Nμ works for a total surplus of μ/2 per work
consumed on average, yielding the consumer (μ/2)2Nμ = Nμ².
Hence a consumer’s net surplus, if he copies, is Nμ² − F. On the
other hand, a consumer without copying technology pays p per work
and consumes up to a distance of μ − p for a net surplus of
N(μ − p)². Therefore a consumer copies if and only if
Nμ² − F > N(μ − p)², or if and only if


$$\mu > \frac{F + Np^{2}}{2pN}. \tag{6}$$


As expected, a consumer is more likely to copy the greater his demand,
μ, the greater are the prices of originals, the larger the number
of works, and the smaller is F.
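The algebra that takes the surplus comparison to condition (6) is short enough to spell out. The following is only a worked expansion of the inequality stated in the text, using the symbols already defined (N, μ, p, F) and assuming p > 0 and N > 0 so that dividing by 2pN preserves the direction of the inequality:

```latex
\begin{align*}
N\mu^{2} - F &> N(\mu - p)^{2} \\
N\mu^{2} - F &> N\mu^{2} - 2N\mu p + Np^{2} \\
2N\mu p &> F + Np^{2} \\
\mu &> \frac{F + Np^{2}}{2pN}.
\end{align*}
```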

Again invoking the assumption that μ is identically distributed at
each location, copying removes the same fraction of high-μ consumers
from the market for originals at each location. The effect on the
demand for the output of any one firm can be analyzed as follows. At
a distance of ψ, consumers demand originals if and only if
p + ψ < μ < (F/2Np̄) + (p̄/2), where p̄ denotes the common price of all
works and p denotes the price charged by the firm under consideration. In
effect, we assume there are enough works that the copy decision is
independent of the price charged by any one of them. Hence changes
in the firm’s own price, p, affect only the margin between original
buyers and nonconsumers (i.e., μ ≷ p + ψ) and not the margin between
copiers and original buyers (μ ≷ [F/2Np̄] + [p̄/2]). Thus, we
can derive a demand curve for a work that depends on both p and p̄.

The demand for originals is then:

$$2\int_{0}^{\theta(\bar{p}) - p} \bigl\{G[\theta(\bar{p})] - G(p + \psi)\bigr\}\,d\psi, \tag{7}$$

where G is the cumulative distribution function for μ at each location
and θ(p̄) = (F/2Np̄) + (p̄/2). In our specific example, (7) becomes
(1/a)[θ(p̄) − p]² and the revenue-maximizing p is ½θ(p̄).
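As a check on the closed form quoted for the specific example, assume (as the numerical example appears to) that μ is uniformly distributed on [0, a], so that G(x) = x/a on that interval, and that θ(p̄) does not exceed a. Writing θ for θ(p̄), the integral in (7) then reduces as follows; this is a sketch under that stated assumption, not a derivation taken from the paper:

```latex
\begin{align*}
2\int_{0}^{\theta - p} \bigl\{G(\theta) - G(p + \psi)\bigr\}\,d\psi
  &= \frac{2}{a}\int_{0}^{\theta - p} (\theta - p - \psi)\,d\psi \\
  &= \frac{2}{a}\cdot\frac{(\theta - p)^{2}}{2}
   = \frac{1}{a}(\theta - p)^{2}.
\end{align*}
```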

Since all firms are alike, all face the same demands, so a symmetric
Nash equilibrium occurs where p = p̄. In our example, this yields
p = √(F/3N). Note that demand depends indirectly on the number of
firms, because as N rises the number of copiers rises and the demand
for the output of any particular firm falls. Somewhat paradoxically,
collusion among all firms to charge lower prices might increase the
profits of all firms since in a competitive equilibrium each firm does
not account for its effect in reducing the demand for other firms by
inducing consumers to invest in the copy technology.

⁶ If the copy technology is also used for activities aside from unauthorized copying, F
should be interpreted as the excess of the machine’s cost above the surplus it generates
in these other uses.

⁷ When firms are not evenly spaced the analysis is still correct if interpreted as
applying to the average consumer.

Consider the welfare effects of copying in this model. First, prices
fall because firms face a lower and a more elastic demand curve.
Hence, consumers’ surplus per work must rise both because prices fall
(making original buyers better off) and because, by revealed preference,
copiers are better off. Revenues per work, however, must fall,
and conceivably social surplus might decrease. This can be illustrated
graphically in figure 3, which depicts the linear demand for a person
with μ = μ₁; when p = μ₁, no originals are consumed, while when p =
0, this consumer demands works up to μ₁ distance away; hence 2μ₁N
works.

Without the possibility of copying, price is p′ and social surplus
comprises areas A + B + C. With copying, price falls to p̄ and social
surplus rises by D + E if this consumer does not copy. But for a copier
the change in social surplus is D + E + G − F, where F is, again, the
fixed cost of copy technology.⁸ One copies, however, only if C + E +
G > F so that if

C + E > F − G > D + E, (8)

this consumer reduces social surplus when copying is allowed.

In (8), since D, E, and G are constant for all values of μ (as long as μ
> p̄) and F, the possible outcomes are described below.

1. If D + E + G > F, then all consumers add to social surplus
(because the increase in consumers’ surplus exceeds loss of revenue
to owners of originals).

2. If F > D + E + G, then

a) low-μ consumers who do not copy will still contribute to social
surplus, but

b) high-μ consumers (copiers) will reduce social surplus (since revenue
loss exceeds gain in consumers’ surplus).

⁸ Copy technology is assumed to be produced in a competitive market, so F is the
social marginal cost.
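The case distinction above can be restated as a small decision rule. The sketch below is illustrative only: the function name `classify_consumer` and the numeric areas in the usage lines are hypothetical, and the areas C, D, E, G are simply taken as given numbers rather than computed from figure 3. It encodes the two statements in the text: a consumer copies only if C + E + G > F, and a copier changes social surplus by D + E + G − F, while a noncopier adds D + E.

```python
def classify_consumer(C, D, E, G, F):
    """Return (copies, change_in_social_surplus) for one consumer.

    C, D, E, G are the figure-3 areas for this consumer and F is the
    fixed cost of copy technology; all are plain numbers here.
    """
    copies = C + E + G > F            # private copying condition from the text
    if copies:
        change = D + E + G - F        # copier: surplus gain net of the fixed cost
    else:
        change = D + E                # noncopier: gains from the lower price only
    return copies, change


# Hypothetical areas satisfying condition (8): C + E > F - G > D + E.
copies, change = classify_consumer(C=5, D=1, E=1, G=1, F=4)
print(copies, change)   # this consumer copies and lowers social surplus
```

With a smaller fixed cost (for instance F = 2 with the same areas), the same consumer still copies but now adds to social surplus, which is case 1 of the list.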

The possibility that copying leads to a welfare loss in the short run
arises if F is large enough and if there are enough high-μ consumers
who become copiers. As in the household production model, welfare
gains are greater the more demand expansion there is relative to
demand switching, since the former rises with areas D, E, and G while
the latter depends on area C and the fixed cost, F.

An obvious application of the fixed cost model is the taping of
programs from commercial television broadcasts. The analysis is complicated
by the fact that the exchange between the owner of the work
and the consumer is not one of money but rather exposure to the
advertising message. What the consumer suffers is the disutility of
having the message interrupt his program while the seller gains an
audience he can sell to the advertiser. To the extent that the advertising
messages are avoided in copying, this exchange no longer takes
place. Empirical evidence on this issue is scant, but it is certain that
many commercials are avoided by copying. A further complication
arises from the fact that the disutility of the commercial may exceed
its revenue value to the broadcaster so that some viewers would be
willing to pay more than their demographic contribution to advertising
revenues to avoid commercials. In terms of figure 3, area C (the
goods equivalent of the disutility avoided by skipping the commercials
on the programming watched) would exceed the revenue loss to
broadcasters, and the gain in social surplus caused by copying is
greater than D + E + G − F by that excess.


The Long Run 

The analysis of the long run in the fixed cost model is similar to that in
the household production model. Since total consumers’ surplus is
simply the product of the number of works and surplus per work
(independent of N), copying can clearly reduce consumers’ surplus if
the revenue decrease induces a great enough fall in N to offset any
rise in surplus per work. Since revenue losses are due to demand
switching while surplus gains come from demand expansion, the conditions
for copying to reduce consumer welfare in the long run are
little demand enhancement relative to demand switching and a high
elasticity of supply.

Our numerical example illustrates the point. When consumers cannot
copy, each firm charges p = 1/6, generating consumers’ surplus =
2/81 and revenue = 3/81. If there are 25 firms, total social surplus is
1.5432.

Suppose copy technology is available at F = 3/4. Then, 60 percent of
consumers buy it (0.2 ≤ μ ≤ 0.5), p falls to 0.1, revenue per firm drops
to 0.002, but consumers’ surplus per firm rises to 0.06066 (see App.).
Total social surplus is 1.5665 if N remains at 25. But since revenue has
fallen, the number of firms will shrink in the long run. For example, if
the supply elasticity were 0.5 and the market had been in long-run
equilibrium before copying was introduced, then copying reduces N
to 11, raises p to 0.151, and raises revenue per firm to 0.007. Note that even
though each firm’s demand is independent of other firms’, the decline
in N reduces the incidence of copying, raising the demand for originals
and permitting a higher price. Total consumers’ surplus is now
0.4437 compared with 0.6172 without copying. Thus, even though
per firm consumers’ surplus rises with copying, the number of firms
drops so much that consumers in general are worse off with copying.
The socially optimal number of firms (equating social surplus to the
opportunity cost of the marginal firm) is 32 in this example, which is
above both the copying and copy-free equilibria. Allowing copying,
however, greatly exacerbates the divergence between the actual and
optimal number of firms. Of course, if creative works were closer
substitutes than they are presumed to be in this model, the welfare
losses from the reduction in N would be less severe.
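The figures in this example can be retraced numerically. The sketch below is a reconstruction, not the author's computation: it assumes μ is uniform on [0, a] with a = 0.5, takes F = 0.75, N = 25 and a supply elasticity of 0.5, uses the equilibrium price √(F/3N) and the copy threshold from (6), and integrates the surplus expressions Nμ² − F and N(μ − p)² in closed form. The function names, the closed-form no-copy expressions and the fixed-point iteration for the long-run number of firms are all illustrative choices.

```python
import math

A = 0.5      # upper bound a of the taste parameter mu (uniform on [0, a], assumed)
F = 0.75     # fixed cost of copy technology
N0 = 25      # number of firms before copying is possible
ETA = 0.5    # elasticity of supply of works with respect to revenue per firm


def no_copy_regime(a=A):
    """Price, revenue per firm and consumers' surplus per firm without copying."""
    p = a / 3                               # maximizes p * (a - p)**2 / a
    revenue = p * (a - p) ** 2 / a
    cs = (a - p) ** 3 / (3 * a)             # integral of (mu - p)**2 / a over [p, a]
    return p, revenue, cs


def copy_regime(n, a=A, f=F):
    """Price, copy threshold, revenue and consumers' surplus per firm with n firms."""
    p = math.sqrt(f / (3 * n))              # symmetric equilibrium price from the text
    theta = f / (2 * n * p) + p / 2         # consumers with mu above theta copy
    revenue = p * (theta - p) ** 2 / a      # price times demand from (7)
    copiers = (n * (a ** 3 - theta ** 3) / 3 - f * (a - theta)) / a
    buyers = n * (theta - p) ** 3 / (3 * a)
    return p, theta, revenue, (copiers + buyers) / n


def long_run_firms(n0=N0, eta=ETA, iterations=200):
    """Iterate N = N0 * (R(N) / R0)**eta until it settles (N treated as continuous)."""
    _, r0, _ = no_copy_regime()
    n = float(n0)
    for _ in range(iterations):
        _, _, r, _ = copy_regime(n)
        n = n0 * (r / r0) ** eta
    return n


p0, r0, cs0 = no_copy_regime()
p1, theta1, r1, cs1 = copy_regime(N0)
n_long_run = long_run_firms()
print(p0, r0, cs0, N0 * (r0 + cs0))
print(p1, theta1, r1, cs1, N0 * (r1 + cs1))
print(n_long_run, copy_regime(round(n_long_run)))
```

Treating N as continuous in the iteration and rounding only at the end is a convenience; the text reports the rounded value N = 11.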

The results of the fixed cost model parallel those of the household
production model. The short-run gain in social surplus depends on
the extent of demand switching versus demand enhancement. Long-run
effects depend, in addition, on the value of variety and the elasticity
of supply.

V. Copies as Imperfect Substitutes 

We now examine the effect of dropping the simplifying assumption
that copies are equivalent to originals in consumption value. In general,
the presumption is that copies are inferior to originals, though
that is not necessary for the following analysis. An easy way to model
this is to let willingness to pay for copies be some fraction, δ, of the
willingness to pay for the same original. A consumer who would pay μ
− ψ for an original at a distance ψ (see eq. [1]) will pay only δ(μ − ψ)
for a copy of that work.
The effect of this change is particularly easy to understand in the
fixed cost model. A consumer who invests in copy technology generates
for himself surplus equal to Nδ²μ² − F, so the condition for
copying (corresponding to [6] above) implies that fewer consumers
copy and that the surplus generated by copiers falls as δ falls. In other
words, making copies more imperfect (lower δ) has exactly the same
effect as an increase in F. Interestingly, such a change could (as we
have seen) make consumers as a whole better off in the long run if the
resulting decline in copying increased the number of firms
sufficiently.


VI. Policies toward Copying 

We have seen that unrestricted copying may make consumers worse
off. More generally, even if consumers are better off with unrestricted
copying than with no copying, some government intervention
might increase social welfare. Two possible approaches are taxes on
copying or subsidies for the producers of creative works. Both have
administrative and enforcement problems. For example, the scheme
established for photocopiers to compensate copyright owners appears
to be unable to cover its own transactions costs.⁹ A tax on the act of
copying would be nearly impossible to enforce, so most proposals
have suggested a tax on the purchase of goods complementary to
copying (video or audio recorders, blank tapes, copier ink). An inevitable
inefficiency arises when these taxed goods are not used for
copying of creative works.

To illustrate the effects of a tax on copying, consider the household
production model of copying in our specific example. Figure 4 shows
the effect of imposing a specific tax on copying with the number of
firms held constant. Since the price of originals would be 1/6 with no
copying, any tax ≥ 1/6 eliminates copying altogether. At the other
extreme, a zero tax reduces the price to 1/8. In the short run, consumers
are clearly better off with unlimited copying, but social surplus is
maximized when the tax = 0.09. Revenues (including tax revenue)
rise by 33 percent with a tax = 0.09, while consumers’ surplus falls by
about 17 percent. Thus, if the elasticity of supply were greater than
about 0.5 and tax revenue were rebated to the owners of creative
works,¹⁰ consumers would be better off with the tax in the long run.
Note that this tax is a large fraction (68 percent) of the market price of
originals. A similar case can be made in the fixed cost model for a tax
on copying technology since it has already been argued that an increase
in F could make consumers better off in the long run.

⁹ See Leibowitz (1983) for a discussion of the troubles of the Copyright Clearance
Center.

¹⁰ Since all firms are identical in this model, an equal division of revenues is appropriate.
If firms differed, presumably revenues would be divided in proportion to each
firm’s sale of originals.

An alternative policy subsidizes the production of originals, thereby
bringing the price faced by consumers closer to social marginal cost
while increasing revenue to firms. One virtue of this scheme is its ease
of administration and enforcement compared with a copy tax, though
the social cost of the public funds required must be accounted for in
any final assessment. Also, a subsidy might be seen as horizontally
inequitable since low demanders will be subsidizing the consumption
of high demanders.

VII. Conclusion

Using two plausible models of copying, this paper has advanced the
possibilities that unlimited copying reduces social welfare and that
restrictions on copying may enhance social surplus. Such possibilities
were shown to depend on (1) the degree to which copying reduces the
demand for originals as opposed to increasing total consumption, (2)
the elasticity of supply of creative works, and (3) the value consumers
place on product variety. Some attempt to measure these three factors
empirically would seem to be the next item on the research agenda.


Appendix 

Consumers’ surplus (CS) generated by one firm in the household production 
model is the sum of that due to buyers of originals and that due to copiers: 


$$CS = 2\int_{0}^{\infty}\!\int_{p}^{\infty}\!\int_{p+\psi}^{\infty} (\mu - p - \psi)\,g(\mu, w)\,d\mu\,dw\,d\psi
     + 2\int_{0}^{\infty}\!\int_{0}^{p}\!\int_{w+\psi}^{\infty} (\mu - \psi - w)\,g(\mu, w)\,d\mu\,dw\,d\psi. \tag{A1}$$

In the specific example, g(μ, w) = 1/a² for 0 ≤ μ ≤ a, 0 ≤ w ≤ a, 0 ≤ a ≤ ½.
Expression (A1) becomes

$$CS = \frac{1}{a^{2}}\int_{0}^{a-p}\left[(a - p)(a - p - \psi)^{2} + (a - \psi)\,p\,(a - \psi - p) + \frac{p^{3}}{3}\right]d\psi. \tag{A2}$$


Substituting p = a/4 and integrating, we have

$$CS = \frac{249a^{2}}{1{,}536} = .1621a^{2}. \tag{A3}$$


Revenue can be found by using equation (4) of the text. 
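The value in (A3) can be checked with exact rational arithmetic. The sketch below reads the integrand of (A2) as (a − p)(a − p − ψ)² + (a − ψ)p(a − ψ − p) + p³/3, integrated over ψ from 0 to a − p and divided by a²; that reading is a reconstruction of the printed formula, and the helper name `cs_per_firm` is hypothetical. Because the integrand is quadratic in ψ, a single application of Simpson's rule gives the integral exactly.

```python
from fractions import Fraction


def cs_per_firm(a, p):
    """Consumers' surplus per firm from (A2), evaluated exactly."""
    def integrand(psi):
        return ((a - p) * (a - p - psi) ** 2
                + (a - psi) * p * (a - psi - p)
                + p ** 3 / 3)

    upper = a - p
    # Simpson's rule is exact for polynomials up to degree three.
    integral = upper / 6 * (integrand(0) + 4 * integrand(upper / 2) + integrand(upper))
    return integral / a ** 2


a = Fraction(1)                  # CS scales with a**2, so a = 1 gives the coefficient
cs_copying = cs_per_firm(a, a / 4)
print(cs_copying, float(cs_copying))
print(float(cs_copying) / 0.0987 - 1)   # proportional rise over the no-copying value in the text
```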

In the long-run problem, total consumers’ surplus is N · CS, while N = ARᵑ,
where R is per firm revenues, η is the elasticity of supply, and A is a constant.
Letting primes denote the copying regime, consumers’ surplus falls when
copying is allowed if N · CS > N′ · CS′ or, taking logs and rearranging,

$$\eta > \frac{\log CS' - \log CS}{\log R - \log R'}. \tag{A4}$$

The right-hand side of (A4) = 1.45 in our numerical example. Similar calculations
reveal the η at which social surplus (R + CS) falls with copying (η =
0.23).
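The step from the inequality on total consumers' surplus to the elasticity threshold can be written out. This is only a worked version of the "taking logs and rearranging" step, using the supply relation in which N equals A times R raised to the power η, and assuming R > R′ (revenue per firm falls with copying) so that dividing by log R − log R′ keeps the direction of the inequality:

```latex
\begin{align*}
N \cdot CS &> N' \cdot CS' \\
A R^{\eta}\, CS &> A R'^{\,\eta}\, CS' \\
\eta \log R + \log CS &> \eta \log R' + \log CS' \\
\eta\,(\log R - \log R') &> \log CS' - \log CS \\
\eta &> \frac{\log CS' - \log CS}{\log R - \log R'}.
\end{align*}
```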

In the fixed cost model when F = ¾, a = 0.5, and N = 25, p = √(F/3N) =
0.1. Equation (6) reveals that consumers with μ > 0.2 copy, while equation (7)
shows that the demand for originals is 0.02, yielding revenues of 0.002. Referring
to figure 3, surplus for a copier is Nμ² − F while original buyers
receive N(μ − p)². Integrating over all copiers (0.2 ≤ μ ≤ 0.5) and original
buyers (0.1 ≤ μ ≤ 0.2) gives

$$\int_{0.2}^{0.5} 2(N\mu^{2} - F)\,d\mu + \int_{0.1}^{0.2} 2N(\mu - p)^{2}\,d\mu = 1.5 + .0166. \tag{A5}$$

Money Is What Money Does: Monetary
Aggregation and the Equation of Exchange 


Paul A. Spindt 

Board of Governors, Federal Reserve System 


This paper develops a method for distilling the transactions role of
the array of monetary assets available to the public into a single
aggregate measure of effective transactions balances. The result is a
monetary quantity index number, designated MQ, which corresponds
closely with the monetary aggregate contemplated in the
equation of exchange, MV = PQ. In general, MQ and the conventional
aggregate M1 exhibit strikingly similar behavior. However,
during episodes when the behavior of M1 is “abnormal” relative to
income and interest rates, MQ behaves differently from M1. During
these periods, anomalies in the behavior of MQ are not detectable.


Although money has many close substitutes as a store of
value, not even the nearest of near moneys shares with it
the simple but momentous characteristic of routine exchange
and circulation. [L. B. Yeager 1968]

Aggregation over a multiplicity of money commodities, each of which
circulates to some extent as means of payment, is an important practical
problem that has not yet been addressed in the monetary literature
emphasizing the medium-of-exchange function of money.¹ Prior
to the 1970s, the narrow aggregate M1 seemed adequate, but recently
the task of identifying an empirical measure of the money stock that
corresponds closely with the transactions aggregate contemplated in
this branch of the literature has been seriously complicated by the
introduction of hybrid transactions/investment assets. The development
of NOW accounts, money market mutual fund share accounts,
super NOW accounts, and money market deposit accounts has had
the effect of blurring the traditionally sharp distinction between
money and near money, since each of these assets possesses to various
degrees both means-of-payment (transactions) and investment (portfolio)
attributes.

The analysis and conclusions set forth in this paper are mine and do not necessarily
Moore, P. A. V. B. Swamy, and members of various workshops at which the paper has
whose innovative work in developing the Divisia monetary aggregates pioneered the
application of index numbers to monetary data. Excellent research assistance was provided
by Clifton Wilson and Bruce Gilsen. Sharon Sherbert and Susan Eubank provided
patient and expert word processing.

The complications caused by these new money assets are due in
large part to the “all-or-none” convention used in the construction of
monetary aggregates. Under this convention, a decision is made as to
the lowest-order aggregate in which to include a given asset. Then
total dollar holdings of the asset are fully included in that aggregate
and fully excluded from all lower-order aggregates. But when this
procedure is applied to hybrid transactions/investment accounts such
as NOWs and money market fund shares, the resulting aggregates
may be distorted from the underlying economic aggregates they seek
to measure. As Morris (1982), president of the Boston Federal Reserve
Bank, has pointed out: “For example, a percentage (probably
small) of money market funds is used as transactions balances and
ought to be included in the money supply [M1], but the great bulk of
money market funds are viewed as short-term investments by their
owners [and should therefore be counted as near money]. . . . Furthermore,
money market funds are only one of an array of new
financial instruments, and some unknown part of these new instruments
ought to be included in the money supply.”² The same reasoning
applies to the case of NOW accounts. These balances are fully
included in the transactions aggregate M1, but some proportion of
them are presumably viewed by their holders as savings balances.

Sporadic attempts to construct monetary aggregates that circumvent
these problems by weighting each component according to its
degree of “moneyness” have been made.³ Chetty (1969), for instance,
estimates elasticities of substitution among a set of financial assets and
uses these estimates to construct the “money equivalent” of each of
the assets. In another approach, Roof (1975) applies the statistical
technique of factor analysis to monetary data and labels the two factors
his procedure uncovers as “moneyness” and “near-moneyness.”
Factor scores can be used to compute a “money” aggregate. While
these results are suggestive, they have not found much acceptance
among monetary economists. Both tactics involve the estimation of
model parameters and neither is equipped to deal with a changing
payments mechanism or menu of financial assets. The main problem
remains that we have been unable to define an acceptable scale of
moneyness.

¹ I take this literature to include both the inventory theoretic transactions demand
for money literature and the branch of the monetary literature concerned with establishing
the new microfoundations of money (see Barro and Fischer [1976], esp. secs. I?
and 7, and the references contained therein).

² Morris is led to the pessimistic conclusion that “we can no longer measure the
money supply with any kind of precision.”

³ The suggestion to do this was made by Gurley (1960) and again by Friedman and
Schwartz (1970). The Federal Reserve’s shift adjustment of M1-B in 1981 is an example
of this strategy. See also Tinsley and Garrett’s (1978) proposal to include a portion of
repurchase agreements in M1. For a more unusual approach see Kane (1964).

The use of index numbers in the construction of monetary aggregates
is a promising alternative to the all-or-none convention, which
does not require the explicit definition of a scale of moneyness. In an
approach pioneered by Barnett (1980), monetary assets are viewed à
la Friedman (1956, para. 14) as durable goods that render to their
holders a variety of “monetary” services. The service flow of a given
monetary asset is priced by the asset’s user cost, as in Barnett (1978).
Standard results from aggregation and index number theory are then
applied to construct quantity aggregates (the Divisia monetary
aggregates) that measure the flow of monetary services used by the
economy’s monetary asset holders. But “money” in this analysis, as in
Friedman’s (1956) perception, is a capital aggregate, the demand for
which by wealth holders is a problem in capital theory. In particular,
the economic focus of the analysis underlying the Divisia aggregates is
the demand for monetary services, and in this sense, a money good is
analytically indistinguishable from any other capital good (autos, to
choose randomly, which are demanded for “automotive” services).⁴

This conception of money is insufficiently narrow for the analytical
and empirical purposes of some monetary economists who, like the
original quantity theorists, emphasize the primary significance of
money’s distinctive role as means of payment. For these economists,
the special role of money goods in the trading process constitutes the
essential economic property of money, because it results in an important
asymmetry between money goods and other goods as sources of
effective demand.⁵ In this paper, an index number solution to the
problem of measuring aggregate money qua medium of exchange is
proposed.

⁴ In practice, monetary services are defined implicitly as whatever is priced by the
user cost. In addition to general acceptability as means of payment (and not all assets
included in the higher level Divisia aggregates provide this service), monetary services
may consist in many other things, such as liquidity, portability, divisibility, and surety of
nominal value.

⁵ See, e.g., Clower (1967), Yeager (1968), Clower and Howitt (1978), Jovanovic
(…), or Hahn (1983). The concentration of the means-of-payment function in at
most a few money goods has been rationalized by Brunner and Meltzer (1971) and
Jones (1976), among others.

The paper is organized as follows. The question of monetary aggregation
is addressed in Section I by considering the equation of exchange
for an economy with multiple means-of-payment goods. An
index number (pair) measure of the money stock and velocity aggregates
that appear in this equation is proposed. The empirical construction
of these index numbers is discussed in Section II. In Section
III, the empirical behavior of these aggregates and their relation to
other economic variables is examined. These relations appear to have
been reasonably stable over the 1970-83 period. Finally, concluding
remarks are contained in Section IV.


I. Monetary Aggregation and the Equation 
of Exchange 

The equation of exchange is a useful point of departure for discussing
monetary aggregation when the narrow means-of-payment view
of money is to be emphasized. Although it is often treated as a simple
identity, the equation nevertheless has important analytical content
because of the twin partitions it imposes on nominal spending. On the
right (receipts) side of the equation total spending is represented in
terms of the real quantity flows of the nonmoney goods in the trade
stream and their prices, while on the left (payments) side total spending
is represented in terms of the stocks of money goods and the rates
at which these stocks circulate in support of trade. Thus, for an economy
with K means-of-payment goods, m_k, k = 1, . . . , K, and N other
goods, the equation of exchange is written as

$$\sum_{k=1}^{K} m_{k} v_{k} = \sum_{j=1}^{N} p_{j} q_{j} \tag{1}$$

or simply m′v = p′q in the obvious vector notation. The asymmetric
treatment of money and nonmoney goods in this equation reflects the
special role as means of payment assigned to money goods in a monetary
economy (see Clower 1967; Friedman 1970; Pesek 1973).

The question of aggregation arises when it is desired, say for macroanalytic
purposes, to represent equation (1) in the “simplified” form

MV = PQ, (2)

where M, V, P, and Q are scalar-valued aggregator functions of the
2(K + N) variables in (1).⁶ To preserve the analytical premise of the
equation of exchange (that is, the partition of nominal spending into
money stock and velocity components on the one hand and price and
quantity flow components on the other), the domains of the aggregator
functions in equation (2) must consist solely of the set of
corresponding variables appearing in (1). That is, for example, M in
(2) is the value of the aggregator function M(m): m → M and V is the
value of the aggregator function V(v): v → V.⁷ In addition, these aggregator
functions must satisfy M(m)V(v) = m′v and P(p)Q(q) = p′q.

⁶ Depending on how q and v are defined, we may have the income or the total
transactions version of the equation. Throughout this paper we shall confine our attention
to the income version.

The problem of finding a pair of aggregator functions M(m), V(v)
that satisfy these conditions is discussed in Spindt (1984). In general,
it is shown there, such a pair of aggregator functions will exist if the
trading sequence summarized in the equation of exchange (1) is
efficient in the sense of minimizing trading costs for a given nominal
volume of trade. Moreover, the functional forms of the money stock
and velocity aggregators, M(m) and V(v), are implied by the form of
the trading cost function together with the condition M(m)V(v) =
m′v.⁸ Thus, M(m) and V(v) involve the unknown cost parameters of
the underlying trading technology.

In the particular solution derived in Spindt (1984), the Fisher ideal
index number formula is exact for the aggregator functions M(m)
and V(v), in the sense that

$$\frac{M(\mathbf{m}_{1})}{M(\mathbf{m}_{0})} = \left(\frac{\mathbf{m}_{1}'\mathbf{v}_{0}}{\mathbf{m}_{0}'\mathbf{v}_{0}}\cdot\frac{\mathbf{m}_{1}'\mathbf{v}_{1}}{\mathbf{m}_{0}'\mathbf{v}_{1}}\right)^{1/2},
\qquad
\frac{V(\mathbf{v}_{1})}{V(\mathbf{v}_{0})} = \left(\frac{\mathbf{m}_{0}'\mathbf{v}_{1}}{\mathbf{m}_{0}'\mathbf{v}_{0}}\cdot\frac{\mathbf{m}_{1}'\mathbf{v}_{1}}{\mathbf{m}_{1}'\mathbf{v}_{0}}\right)^{1/2}, \tag{3}$$

where the subscripts designate time periods.⁹ The empirically useful
implication of this result is that index number measures of the money
stock and velocity aggregates in the “simplified” equation of exchange
(2) can be computed from money-good-holding and turnover rate
data without having to estimate the functional parameters involved in
the aggregators M(m) and V(v). Index numbers are, of course, the
way we conventionally measure the P(p) and Q(q) appearing on the
right side of (2). The interpretation of the magnitudes in (2) as index
numbers was urged forcibly by Warburton (1953).
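To make the index formulas concrete, the sketch below computes the Fisher ideal money stock and velocity indexes between two periods from vectors of money-good holdings m and turnover rates v. The function names and the two-asset data are hypothetical and chosen only for illustration; the check at the end confirms that the product of the two indexes reproduces the ratio of nominal spending m′v between the periods, as the condition M(m)V(v) = m′v requires.

```python
import math


def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def fisher_money_index(m0, v0, m1, v1):
    """M(m1)/M(m0): geometric mean of base- and current-weighted quantity ratios."""
    return math.sqrt(dot(m1, v0) / dot(m0, v0) * dot(m1, v1) / dot(m0, v1))


def fisher_velocity_index(m0, v0, m1, v1):
    """V(v1)/V(v0): the same formula with the roles of m and v exchanged."""
    return math.sqrt(dot(m0, v1) / dot(m0, v0) * dot(m1, v1) / dot(m1, v0))


# Hypothetical data: two money goods observed in periods 0 and 1.
m0, v0 = [100.0, 50.0], [10.0, 2.0]
m1, v1 = [110.0, 60.0], [11.0, 2.5]

money_index = fisher_money_index(m0, v0, m1, v1)
velocity_index = fisher_velocity_index(m0, v0, m1, v1)
spending_ratio = dot(m1, v1) / dot(m0, v0)

print(money_index, velocity_index)
print(math.isclose(money_index * velocity_index, spending_ratio))
```

Only inner products of observed holdings and turnover rates enter the calculation, which is the sense in which no parameters of the aggregator functions need to be estimated.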

It should be noted that this condition is violated for the velocity aggregate
corresponding to a conventional summation aggregate such as M1. In this case,
the velocity aggregate depends on m as well as v, with the consequence that
aggregate velocity may change without a change in any component of v.

These results roughly parallel those of Diewert (1976), which apply to price
and quantity aggregates, i.e., the right-hand side of eq. (2).

The derivation leading to this conclusion is based on the technically
convenient, but otherwise inconsequential, choice of a quadratic mean of
order 2 functional form for the trading cost function. Use of some other
flexible form would lead to a different (superlative) index number formula,
but all the superlative index numbers move very closely together (see
Diewert 1976).

Two features of the index number measures (3) are worth highlighting.
First, the money stock and velocity aggregates in (3) are not simply
weighted averages of their components along the lines suggested by
Friedman and Schwartz (1970).¹⁰ Because they are index numbers, they are
dimensioned as pure numbers; their levels are normalized and have meaning
only in comparison with levels in some other time period. Second, although
turnover rates (money good holdings) appear in the index number formula
for the money stock (velocity) aggregate, movements in the money stock
(velocity) aggregate (3) are generated entirely by movements in m (v).
That is, the money stock measure moves when and only when there is some
change in m between periods, and the velocity index moves when and only
when there is some change in the vector of turnover rates v.¹¹ Thus, the
analytical premise of the equation of exchange (1) discussed above is
preserved by these aggregates. This property is not shared by the velocity
measure corresponding to a conventionally defined monetary aggregate such
as M1 (or a Divisia aggregate); the velocity of M1 can change even when
all turnover rates are constant.
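The property just described can be checked numerically. The sketch below
assumes that the index numbers in (3) take the standard Fisher ideal form,
that is, the geometric mean of a ratio weighted by previous-period turnover
rates and a ratio weighted by current-period turnover rates. The function
names and the sample figures are hypothetical and are not taken from the
paper.

```python
import numpy as np


def fisher_money_ratio(m_prev, m_curr, v_prev, v_curr):
    """One-period ratio of the money stock index (assumed Fisher ideal form)."""
    m_prev = np.asarray(m_prev, dtype=float)
    m_curr = np.asarray(m_curr, dtype=float)
    v_prev = np.asarray(v_prev, dtype=float)
    v_curr = np.asarray(v_curr, dtype=float)
    # Ratio of holdings weighted by previous-period turnover rates.
    base_weighted = (m_curr @ v_prev) / (m_prev @ v_prev)
    # Ratio of holdings weighted by current-period turnover rates.
    current_weighted = (m_curr @ v_curr) / (m_prev @ v_curr)
    return float(np.sqrt(base_weighted * current_weighted))


def fisher_velocity_ratio(m_prev, m_curr, v_prev, v_curr):
    """One-period ratio of the velocity index: same formula, roles swapped."""
    return fisher_money_ratio(v_prev, v_curr, m_prev, m_curr)


# Hypothetical holdings ($ billions) and monthly turnover rates.
m0, m1 = [150.0, 240.0, 130.0], [155.0, 250.0, 140.0]
v0, v1 = [1.0, 2.0, 0.5], [1.2, 2.1, 0.6]

money_ratio = fisher_money_ratio(m0, m1, v0, v1)
velocity_ratio = fisher_velocity_ratio(m0, m1, v0, v1)
```

Under this assumed form, holding m fixed leaves the money ratio at one
whatever happens to v, and the product of the two ratios equals the ratio of
total spending in the two periods.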


II. The Fisher Money Stock Index 

In the preceding section, it was suggested that empirical measures of 
the money stock and velocity aggregates that appear in the equation 
of exchange can be constructed by applying the Fisher ideal index 
number formula to component money-good-stock and turnover rate 
data. To do this, operational definitions of the components of m and 
v must be adopted. The discussion in this section is devoted to this
task. 

¹⁰ In particular, the money stock (velocity) aggregate in (3) is not simply
a "turnover rate" ("money-holding") weighted aggregate of its components.
However, it is possible to approximate the growth rate of the money stock
(velocity) aggregate in (3) as a weighted sum of the growth rates of the
component money holdings (turnover rates). But in this approximation the
weights are shares of nominal spending.

¹¹ This is clearly a desirable property. If it did not hold, for instance,
for P(p) or Q(q) on the right side of the equation, we could have a change
in the measured price level even though all prices were constant.

The view of money as means of payment on which the equation of exchange
is based provides a relatively clear-cut criterion for which monetary
assets to include as components of m. In particular, an asset is included
if and only if it serves more or less generally as medium of exchange.
Thus, monetary assets such as short-term repurchase agreements and
overnight Eurodollar deposits, which are highly liquid but do not serve as
means of payment, are excluded, while hybrid transactions/investment
assets like NOW accounts and money market mutual fund shares are included
because they are third-party transferable. For purposes of this paper, the
following six assets constitute the components of m: (1) the currency
component of M1 plus traveler's checks, (2) the demand deposit component
of M1, (3) the other checkable deposits component of M1 (which includes
both ordinary and super NOW accounts), (4) money market mutual fund shares
net of share balances devoted to IRA or Keogh accounts, (5) savings
accounts subject to telephone transfer ("bill payer" accounts), and (6)
money market deposit accounts. Stock quantity data on these assets are
readily available at monthly frequency in the published monetary
statistics (e.g., Federal Reserve Board release H.6).

The components of v are the turnover rates of these assets. There 
are, however, two major measurement problems. First, no turnover 
statistics are collected for currency or traveler’s checks. Second, the 
available turnover statistics for the other components of m are based 
on gross debits. Gross debits include, in addition to debits repre¬ 
senting payment for final product, debits that arise in other product 
account and capital account transactions.¹² The turnover rates
needed to compute the index numbers (3) are net of this latter type. 

The procedures used for overcoming these problems in the measurement
of v are discussed below. It will appear that some of the
assumptions used are uncomfortably ad hoc. In defense it can be said 
that the resulting index number measure of the money stock does not 
seem to be particularly sensitive to variations in these assumptions. 
But in any event, these procedures should be viewed as highly provi¬ 
sional. Further research should provide better measurements. 

An approach suggested by Fisher (1911) is used to determine the turnover
rate of cash.¹³ Observing that most cash is obtained from banks, as when a
check is cashed, and that most cash is ultimately redeposited in banks,
Fisher proposed to measure total cash payments by multiplying aggregate
cash withdrawals by the number of consecutive payments between withdrawal
and redeposition, that is, the average length of the cash loop. Data on
cash withdrawals could presumably be collected from banks' books, but
these data are not regularly collected and published along with the
monetary statistics. Data on the average length of the cash loop are more
difficult to obtain. We do, however, have some useful information. Citing
bits of survey evidence, Cramer (1980) estimates annual cash withdrawals
at banks to be about four times demand deposit balances, and the average
length of the cash loop to be two payments. These numbers imply a monthly
turnover rate for currency equal to 2/3 times the ratio of demand balances
to the currency stock (4 × 2 / 12 = 2/3). A slight modification of this
estimate is made here to account for the cashing of NOW account orders and
savings deposit shares.¹⁴ Gross turnover rates on the other components of
m are available monthly in the Federal Reserve Board's Release G.6.

¹² A rough taxonomy of transactions is:

I. Output transactions
   A. Final output transactions
   B. Intermediate product transactions
   C. Raw materials transactions
II. Capital transactions
   A. Existing real asset transactions
   B. Financial asset transactions
      1. Money asset transactions
      2. Other financial asset transactions

Gross debits include debits arising in all these categories. The turnover
rates required are net of all debits except those arising as payment for
transactions in category I.A.

¹³ An alternative method for measuring the turnover rate of currency was
suggested by Laurent (1970). Laurent's ingenious approach exploits
information on the life distributions of the various denominations of
currency that can be computed from bank note redemption data. He assumes
that a note is redeemed when and only when it has completed G transfers in
payment. Given actual redemption rate data and the composition of currency
by denomination, one can then express the turnover rate of the currency
stock as a function of G (Laurent does not do this directly, however).
Laurent's empirically selected value of G = 129 gives a cash turnover rate
that has ranged between about 7 and 10 times per month since 1970. A
crucial assumption in this

procedure is that the number of net cash payments for final goods and
services made by a given note is well represented by the number of times
the note is physically handled, the determining factor in its life. The
cash turnover rate determined by this procedure is also quite sensitive to
choice of G, and yet there does not appear to be any very precise way of
assigning a value to G. Also, Laurent's method applies only to currency.
See Cramer's (1980) critique of Laurent's method.

¹⁴ The actual formula used is $v_{1t} = 2(m_{2t} + m_{3t})/(3 m_{1t})$,
where $v_{1t}$ is the gross monthly turnover rate for currency, and
$m_{1t}$, $m_{2t}$, and $m_{3t}$ are, respectively, stocks of currency,
demand deposits, and other checkable deposits.

Turnover rates based on gross debits are not the figures needed for the
computation of the money stock and velocity indexes in the equation of
exchange because they reflect payments for transactions that are not
counted on the right side of the equation. It is useful to sort the total
debits for each type of monetary account into four categories. First,
total debits may include debits arising from money-changing transactions.
A debit is recorded, for example, against demand deposits when a check is
cashed. Second, total debits may include debits arising as payments for
transactions on the capital account as, for instance, when equities or
used goods or financial assets are traded. This category can be labeled
capital account debits to distinguish it from payment for current output.
Third, total debits may include debits arising as payment for raw
materials and intermediate output. This category of debits can be labeled
product (nonfinal) debits. Fourth, there are debits arising as payment for
final output, which can be labeled as product (final) debits. The total,
across all types of money assets, of these product (final) debits is just
equal to GNP. The turnover rates required to compute (3) include only
these debits. Algebraically, what we have said is this: Let $D_k$ denote
gross debits to the kth component of m. Then

$$D_k = D_k^{MC} + D_k^{CA} + D_k^{PNF} + D_k^{PF}, \tag{4}$$

where the superscripts MC, CA, PNF, and PF mean, respectively, money
changing, capital account, product (nonfinal), and product (final). Also,

$$\sum_{k=1}^{K} D_k^{PF} = \mathrm{GNP}. \tag{5}$$

Finally, the turnover rates required are

$$v_k = \frac{D_k^{PF}}{m_k}, \tag{6}$$

while the gross turnover rates published are

$$u_k = \frac{D_k}{m_k}. \tag{7}$$
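The cash-loop estimate and the decomposition in (4)-(7) can be put together
in a few lines. The sketch below is illustrative only: the function names
and all figures are hypothetical, and the currency formula is the one stated
in footnote 14.

```python
def currency_gross_turnover(currency, demand_deposits, other_checkable):
    """Gross monthly turnover rate of currency, as stated in footnote 14.

    Annual withdrawals of about 4 times deposit balances and a cash loop of
    2 payments give 4 * 2 / 12 = 2/3 payments per month per dollar of deposits.
    """
    return 2.0 * (demand_deposits + other_checkable) / (3.0 * currency)


def final_product_turnover(stock, gross_debits, money_changing,
                           capital_account, product_nonfinal):
    """Net turnover rate of eq. (6): final-product debits divided by the stock."""
    # Eq. (4) rearranged: final-product debits are what remains of gross debits
    # after the other three categories are removed.
    product_final = (gross_debits - money_changing
                     - capital_account - product_nonfinal)
    return product_final / stock


# Hypothetical monthly figures in $ billions, for illustration only.
v_currency = currency_gross_turnover(currency=150.0, demand_deposits=240.0,
                                     other_checkable=130.0)
v_demand = final_product_turnover(stock=240.0, gross_debits=8000.0,
                                  money_changing=112.5, capital_account=6500.0,
                                  product_nonfinal=1200.0)
```

The net rate is always below the published gross rate of (7) whenever any of
the three nonfinal categories of debits is positive.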


The method for estimating the money-changing debits, $D_k^{MC}$, is
presented in table 1. There are three sources of money-changing debits.
First, money market mutual fund checks are cleared through commercial bank
demand accounts. Hence $D_2^{MC}$, where the subscript 2 indicates demand
deposits, includes all debits to money market mutual fund shares. Second,
the debits to demand deposits (k = 2) and to other checkable accounts
(k = 3) due to encashments are money-changing debits. These have been
discussed above. Third, the remaining sources of money-changing debits are
transfers from one account type to another. The assumptions, stated in
table 1, used to estimate the volume of these transfers are based on
preliminary data gathered in the Federal Reserve Board's 1983 Survey of
Consumer Finances.

To obtain estimates of the turnover rates (6), estimates of capital
account debits $D_k^{CA}$ and product (nonfinal) debits $D_k^{PNF}$ are
needed. The strategy followed for both these categories of debits was
first to obtain an estimate of their total volume, that is,
$\sum_k D_k^{CA}$ and $\sum_k D_k^{PNF}$, and then to estimate how these
totals are allocated across the $m_k$.

Consider first the capital account debits. Direct measurement of the total
volume of these debits is difficult because they represent payment for
financial asset transactions as well as transactions on the markets for
existing real capital and used goods. An easier method, and the one used
here, is to obtain a measure of total product transactions,
$\sum_k (D_k^{PNF} + D_k^{PF})$, and then to compute capital account
debits as the residual

$$\sum_k D_k^{CA} = \sum_k D_k - \sum_k D_k^{MC} - \sum_k \left(D_k^{PNF} + D_k^{PF}\right). \tag{8}$$

TABLE 1

Computation of Money-changing Debits

                                                                        Example:
                                                                        December 1983
                                                                        ($ billions
k  m_k                D_k^MC                                            per month)

1  Currency           Zero*                                                  0
2  Demand deposits    Sum of:
                        Debits to money market mutual funds†               31.0
                        Encashments‡                                       81.0
                        Transfers to other checkable deposits§               .4
                        Transfers to money market mutual funds‖              0
                        Transfers to money market deposit accounts#          .2
                        Total                                             112.5
                      Percentage of gross debits to demand deposits         1.4
3  Other checkable    Sum of:
   deposits             Encashments‡                                       21.2
                        Transfers to money market funds‖                     .2
                        Total                                              21.4
                      Percentage of gross debits to other checkable
                        deposits                                           12.7
4  Money market       Transfers to money market deposit accounts#            .7
   mutual funds       Percentage of gross debits to money market
                        mutual funds                                        2.2
5  Telephone          Zero*                                                  0
   transfer accounts
6  Money market       Zero*                                                  0
   deposit accounts

* The assumption is that the volume of transfers from these accounts into
other money accounts is insignificant. There are no data to support
alternative assumptions.
† Money market mutual fund checks are cleared through the demand accounts
of the funds.
‡ Encashments are determined by the method of Fisher described in the
text. These equal [illegible] times the stock of demand deposits and
[illegible] times the stock of other checkable deposits per month.
§ Equals [illegible] times any increase in the stock of other checkable
accounts. This assumption is based on preliminary data gathered in the
Federal Reserve Board's 1983 Survey of Consumer Finances.
‖ Equals [illegible] times any increase in the stock of money market
mutual funds. This assumption is based on preliminary data gathered in the
Federal Reserve Board's 1983 Survey of Consumer Finances.
# Equals [illegible] times any increase in the stock of money market
deposit accounts. This assumption is based on preliminary data gathered in
the Federal Reserve Board's 1983 Survey of Consumer Finances.

The procedure used to measure total product transactions is from Cramer
(1980), and the monthly construction of this measure is described in
table 2. There remains the question how to allocate this total volume of
capital account debits across the components of m. Preliminary evidence
from the Federal Reserve Board's 1983 Survey of Consumer Finances suggests
that about 85 percent of total payments out of money market mutual funds
and money market deposit accounts are for capital account transactions.
This determines $D_k^{CA}$ for these two components of m (k = 4, 6). In
the absence of any data regarding the allocation of the rest of these
debits, the remainder is allocated to demand deposits (k = 2):
$D_2^{CA} = \sum_k D_k^{CA} - D_4^{CA} - D_6^{CA}$.
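The residual calculation in (8) and the allocation rule just described can
be sketched as follows. Two points are assumptions of this sketch rather
than statements from the text: the 85 percent share is applied to gross
debits net of money-changing debits, and the function name, argument names,
and figures are all hypothetical.

```python
def allocate_capital_account_debits(gross, money_changing, total_product,
                                    ca_share=0.85):
    """Allocate capital account debits across the six components of m.

    gross, money_changing: dicts keyed by k = 1..6 (monthly debits).
    total_product: sum over k of product (nonfinal) plus product (final) debits.
    """
    # Eq. (8): total capital account debits are obtained as a residual.
    total_ca = (sum(gross.values()) - sum(money_changing.values())
                - total_product)
    capital_account = {k: 0.0 for k in gross}
    # Money market mutual funds (k = 4) and money market deposit accounts
    # (k = 6): a fixed share of payments. Treating "payments" as gross debits
    # less money-changing debits is an assumption made here.
    for k in (4, 6):
        capital_account[k] = ca_share * (gross[k] - money_changing[k])
    # The remainder is assigned to demand deposits (k = 2).
    capital_account[2] = total_ca - capital_account[4] - capital_account[6]
    return capital_account


# Hypothetical monthly debits in $ billions, for illustration only.
gross = {1: 0.0, 2: 1000.0, 3: 100.0, 4: 40.0, 5: 0.0, 6: 60.0}
money_changing = {1: 0.0, 2: 100.0, 3: 20.0, 4: 0.0, 5: 0.0, 6: 0.0}
allocation = allocate_capital_account_debits(gross, money_changing,
                                             total_product=500.0)
```

By construction the allocated amounts add up to the residual total, so any
error in the 85 percent share is absorbed entirely by demand deposits.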

With regard to the category of product (nonfinal) debits, the total 
of these debits is measured by deducting GNP from the measure of
total product debits described in table 2 in the light of (5).¹⁵ As to
allocating this total across the components of m, it is assumed that the 
composition, by type of means of payment, of payment for final prod¬ 
uct is the same as for total product. This is clearly a heroic assump¬ 
tion, but it or something like it must be made, given the complete 
absence of any data bearing on the question. 

Using these procedures to obtain estimates of $D_k^{CA}$ and
$D_k^{PNF}$, estimates of the final product turnover rates (6) were
constructed. These figures, along with the data on the components of m,
constitute the raw data necessary to compute the money stock and velocity
index numbers proposed in the previous section. For convenience of
reference, I will call these indexes the Fisher money stock and velocity
indexes and denote them by $M_Q$ and $V_Q$.¹⁶ Time series of m, v, $M_Q$,
and $V_Q$ are available from the author on request.

III. Empirical Results 

The behavior of $M_Q$ is depicted in figure 1. Normalized conventional
M1, Divisia M1, and Divisia L are shown along with $M_Q$ for reference.
The velocity measures associated with these money stock measures are
plotted in figure 2.

Inspection of these figures reveals a striking degree of similarity in
the behavior of $M_Q$ and conventional M1. This impression is reinforced
by the data in tables 3 and 4, which list fourth-quarter-to-fourth-quarter
growth rates for selected money stock measures and corresponding
velocities. These data show that in addition to the general similarity in
the mean growth of $M_Q$ and M1, there is a similar degree of volatility
as this is measured by the standard deviations of the growth rates.
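For readers who want to reproduce growth rates of this kind from monthly
data, the following minimal sketch computes a fourth-quarter-to-fourth-quarter
percentage change from quarterly averages of monthly levels. The function
name, the data layout, and the sample series are hypothetical.

```python
def q4_to_q4_growth(monthly_by_year, year):
    """Percentage change of the Q4 average of `year` from the Q4 average
    of the previous year.

    monthly_by_year maps a year to a list of 12 monthly levels (Jan..Dec).
    """
    q4_now = sum(monthly_by_year[year][9:12]) / 3.0       # Oct, Nov, Dec
    q4_prev = sum(monthly_by_year[year - 1][9:12]) / 3.0
    return 100.0 * (q4_now / q4_prev - 1.0)


# Hypothetical index levels, for illustration only.
levels = {
    1980: [100.0] * 12,
    1981: [101.0, 101.5, 102.0, 102.5, 103.0, 103.5,
           104.0, 104.5, 105.0, 106.0, 107.0, 108.0],
}
growth_1981 = q4_to_q4_growth(levels, 1981)
```

Averaging over the quarter before taking the ratio smooths out month-to-month
noise, which is why the same convention is applied to both the money stock
and the velocity series.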

During two historical episodes, however, the similarity in the behavior
of $M_Q$ and M1 is interrupted. During the period 1973-75 and especially
in 1974, $M_Q$ (and the Divisia aggregates) grew at a somewhat faster rate
than M1. Equivalently, during this period growth of $V_Q$ was weaker than
the income velocity of M1. The two measures again diverge in 1979-83 when
$M_Q$ growth was somewhat weaker than that of M1. $M_Q$ was particularly
weak in 1981. In view of the overall similarity, it is worth examining
these periods of different behavior in greater detail.

¹⁵ Monthly GNP data are from Corrado (1983).

¹⁶ The subscript Q identifies these indexes as measures in the income
version of the equation of exchange. Corresponding indexes, subscripted
with a T, can be defined for the total transactions version of the
equation.

TABLE 3

Annual Rates of Money Growth*

Year                  M_Q†      M1            M1^D‡     L^D‡

1971                  6.9       6.7           7.1        9.5
1972                  8.2       8.5           8.8       12.0
1973                  6.4       5.8           6.9        8.5
1974                  5.9       4.6           6.2        6.0
1975                  5.6       5.0           6.1        9.3
1976                  6.3       6.1           6.5       10.2
1977                  8.2       8.2           8.7       11.6
1978                  8.1       8.2           9.0        8.6
1979                  6.8       7.4           7.4        2.9
1980                  6.8       7.2           8.0        4.0
1981                  1.6       5.1 (2.3)§    6.5        2.6
1982                  6.8       8.5           9.1        8.5
1983                  7.3       9.6           7.2        n.a.
Mean                  6.5       7.0           7.5        7.8
Standard deviation    1.6       1.5           1.1        3.2

* Percentage change of fourth quarter average of year listed from fourth
quarter average of previous year.
† $M_{Qt} = M_Q(m_t, v_t; m_{t-1}, v_{t-1}) = M_{Q,t-1} \left[ \frac{m_t' v_t}{m_{t-1}' v_t} \cdot \frac{m_t' v_{t-1}}{m_{t-1}' v_{t-1}} \right]^{1/2}$,
where $v_t$ is the vector of $v_k$ defined in eq. (6).
‡ Divisia.
§ Shift-adjusted M1-B in parentheses.

The 1973-75 period is well recognized as having produced abnormally
sluggish growth of the money stock relative to growth in money income
(Enzler, Johnson, and Paulus 1976). It was during this period that a
downward (upward) shift appeared to occur in the conventional demand for
money (velocity) function. A variety of explanations for this puzzle have
been proposed since it was initially observed; the most cogent factors
appear to be innovations in the financial technology and regulatory
changes that reduced the transactions costs of converting funds between
money and other financial assets. If these changes in the payments
mechanism are the source of the puzzle, then the fact that information on
money use is "internalized" in the Fisher money stock index suggests the
possibility that $M_Q$ demand (or, equivalently, $V_Q$ velocity) might be
better behaved over this period than the M1 function.

TABLE 4

Annual Rates of Velocity Growth*

Year                  V_Q†      M1‡           M1^D‡     L^D‡

1971                  2.6       2.7           2.3         .1
1972                  3.1       2.8           2.5        -.5
1973                  4.9       5.5           4.4        2.8
1974                  1.2       2.2            .8         .9
1975                  4.2       4.8           3.7         .6
1976                  2.9       2.9           2.6       -1.0
1977                  3.8       3.7           3.2         .4
1978                  6.1       6.0           5.2        5.5
1979                  2.7       2.1           2.1        6.6
1980                  2.1       2.0           1.2        4.9
1981                  9.1       5.4 (8.4)§    4.1        6.7
1982                 -4.0      -5.5          -6.0       -4.0
1983                  4.6       2.4           3.1        n.a.
Mean                  3.4       2.8           2.2        1.9
Standard deviation    2.9       2.8           2.7        3.8

* Percentage change of fourth quarter average of year listed from fourth
quarter average of previous year.
† $V_{Qt} = V_Q(m_t, v_t; m_{t-1}, v_{t-1}) = V_{Q,t-1} \left[ \frac{m_t' v_t}{m_t' v_{t-1}} \cdot \frac{m_{t-1}' v_t}{m_{t-1}' v_{t-1}} \right]^{1/2}$,
where $v_t$ is the vector of $v_k$ defined in eq. (6).
‡ Velocities computed as ratio of GNP over aggregate listed.
§ Shift-adjusted M1-B in parentheses.

To test this conjecture, a simple money-demand (inverse velocity) 
function of the form 

log -j/tq = Po + Pi lo g jrrg + P-' lo g $ + Ps lo 8 RTB 
+ P i log RS + PIX-I 

was estimated for M = $M_Q$, M1, and the Divisia aggregates M1 and L
using monthly data from 1970-73.¹⁷ Here P · Q is nominal GNP, RTB is the
90-day bill rate, RS is the passbook savings rate, and N is population.
The equation was then dynamically simulated over 1974-76. Postsample money
stock forecast errors aggregated to quarterly frequency are presented in
table 5. The equation for $M_Q$ tends to underpredict slightly through
1975 and then overpredict in 1976 but generally does reasonably well.
Divisia M1 also outperforms M1 during this period. Beginning in 1975,
growth of Divisia L was considerably stronger relative to income and
interest rates than is consistent with its relation to these variables in
the early 1970s. When the equations are fitted over a longer period
(1970-78), a dummy variable for the months July 1973 through June 1975 has
a significant (negative) coefficient in the M1 equation (and in the
Divisia M1 equation) but an insignificant coefficient in the $M_Q$
equation.

¹⁷ The shortness of the sample span is dictated by the lack of available
$M_Q$ data prior to 1970. Details of the estimated equation are available
from the author on request.

TABLE 5

Cumulative Tracking Errors for Money Stock Forecasts:
Goldfeld-Type Inverse Velocity Function

Date          M_Q       M1        M1^D      L^D

1974-Q1        .4        .0        .5        .1
1974-Q2        .4      -1.0        .1        .1
1974-Q3        .3      -2.2      -1.0       -.1
1974-Q4        .4      -3.5      -1.7        .2
1975-Q1        .3      -4.7      -2.3        .7
1975-Q2        .7      -4.5      -2.0       1.7
1975-Q3       1.1      -4.7      -2.1       3.1
1975-Q4        .2      -6.0      -3.2       4.3
1976-Q1       -.3      -6.3      -3.6       5.2
1976-Q2       -.3      -6.0      -3.5       6.3
1976-Q3       -.8      -6.3      -3.8       7.1
1976-Q4       -.7      -6.1      -3.5       8.0

Note: Estimation period monthly 1970-73. Errors listed as percentage of
money stock.

The behavior of $M_Q$ diverges from M1 again in 1979-83, especially in
1981 when nationwide NOW accounts were introduced. During this period
growth of $M_Q$ was lower than the growth of M1. And in 1981, M1 velocity
grew only moderately, but there was a sharp increase in $V_Q$ growth. The
differences in 1981 may actually be less striking than they seem. It was
generally anticipated that the growth of M1 would for a while overstate
the growth of the underlying transactions aggregate of interest to
policymakers as the public shifted some of its savings balances into the
newly authorized NOW accounts (Simpson 1980). The shift adjustment of
M1-B was intended to compensate for this effect. Growth of shift-adjusted
M1-B in 1981 is much closer to that of $M_Q$.

The remainder of this section addresses the question whether the recent
behavior (since 1979) of the $M_Q$ aggregate is consistent with
historically estimated relationships. Three relationships are examined:
(1) a St. Louis-type reduced-form model of GNP growth, (2) a reduced-form
model of inflation, and (3) a conventional money-demand equation.

The St. Louis reduced-form model takes money growth as a policy- 
determined exogenous variable and relates the growth of nominal 
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Ml 
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3 07 

93 


(64) 


(.39) 

(70) 

(.71) 

fc'o 
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.71 

11 

-.10 


( 40) 


(.36) 

(39) 

(58) 

M’l 

.04 


01) 

.28 

55 


(.49) 


(.43) 

(.51) 

(.89) 

11*2 

S3 


28 

61 

54 


(.39) 


( 34) 

(37) 

(.60) 

U'q 

00 


00 

00 

00 


(00) 


( 00) 

(.00) 

(00) 

-<» 

- 17 


- 19 

-.09 

00 


( 23) 


( 22) 

(22) 

(25) 

^1 

09 


.08 

.21 

.26 


( 2. r >) 


(.24) 

( 25) 

(.29) 

z>> 

.06 


10 

- 01 

- 02 


(.25) 


(.24) 

( 24) 

( 25) 

-3 

00 


00 

00 

00 


(.00) 


(00) 

( 00) 

(00) 

(> 

- 09 
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- 01 
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( 18) 

( 18) 

H J 

23 


27 

24 

20 

D-W 

1 37 


1 89 

1.93 

1.86 

SK 

3.88 


3.76 

3.87 

3 97 


Note: Estimation period 1971-Q2 to 1978-Q4. The estimation procedure
assumes the w and z coefficients are

GNP to current and lagged money growth rates and the current and lagged
values of a fiscal policy variable. In the model due to Tatom (1981), a
strike variable and the relative price of energy are also included. These
refinements are ignored here. Table 6 reports estimates of the
coefficients in the St. Louis-type equation

$$g(\mathrm{GNP})_t = a_0 + \sum_{i=0}^{3} w_i \, g(M)_{t-i} + \sum_{i=0}^{3} z_i \, g(E)_{t-i},$$

where E is the St. Louis fiscal variable, M = $M_Q$, M1, Divisia M1, and
Divisia L (all quarterly), and g(·) denotes the annualized growth rate of
the argument. These equations were estimated for the period 1971-78 and
then dynamically simulated to obtain GNP forecasts for 1979-83. The
forecast errors from this procedure are tabulated in table 7 and summary
statistics presented in table 8. It is evident from these results that the
tracking characteristics of the equation using $M_Q$ are markedly superior
to those of the equation using M1.¹⁸ The overprediction of GNP in the last
half of 1982 by the $M_Q$ equation indicates that $V_Q$ may have been
weaker in 1982 than the historical relationship would project. Additional
evidence to this effect is contained in the results of money-demand
equation simulations.

TABLE 8

GNP-Level Forecast Errors, Summary Statistics: St. Louis Reduced Form
(Prediction Period: 1979-Q1 to 1983-Q4)

                                              Monetary Variable
Summary Statistic                        M_Q      M1      M1^D     L^D

Root mean squared percentage error      2.64     7.39     7.02     9.94
Mean absolute percentage error          2.10     5.31     5.22     8.99
Mean percentage error                   -.63    -5.31    -5.23     8.99
Regression coefficient                   .88      .67      .69     1.51
Theil's inequality coefficient           .01      .04      .04      .06
Fraction of error due to:
  Bias                                   .06      .48      .52      .79
  Different variation                    .19      .41      .39      .15
  Different covariation                  .75      .11      .09      .06

Note: Model described in table 6; errors in table 7.
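Summary statistics of the kind reported in table 8 can be computed as in the
sketch below. It assumes the conventional definitions: percentage errors
taken as actual minus predicted relative to actual, Theil's inequality
coefficient in its bounded (0 to 1) form, and the usual split of mean squared
error into bias, variation, and covariation shares. The paper does not spell
out its formulas, so these definitions, the function name, and the sample
data are assumptions. The regression coefficient row is not covered.

```python
import numpy as np


def forecast_summary(actual, predicted):
    """Forecast-error summary statistics under conventional definitions."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    pct_err = 100.0 * (a - p) / a            # negative values mean overprediction
    mse = np.mean((p - a) ** 2)
    s_a, s_p = a.std(), p.std()              # population standard deviations
    r = np.corrcoef(a, p)[0, 1]
    return {
        "rmspe": float(np.sqrt(np.mean(pct_err ** 2))),
        "mape": float(np.mean(np.abs(pct_err))),
        "mpe": float(np.mean(pct_err)),
        "theil_u": float(np.sqrt(mse)
                         / (np.sqrt(np.mean(p ** 2)) + np.sqrt(np.mean(a ** 2)))),
        # Shares of mean squared error; the three shares sum to one.
        "bias": float((p.mean() - a.mean()) ** 2 / mse),
        "variation": float((s_p - s_a) ** 2 / mse),
        "covariation": float(2.0 * (1.0 - r) * s_p * s_a / mse),
    }


# Hypothetical GNP levels and forecasts, for illustration only.
stats = forecast_summary(actual=[100.0, 102.0, 104.0, 106.0],
                         predicted=[101.0, 103.0, 104.0, 108.0])
```

A large bias share, as reported for the M1 and Divisia equations, indicates
that the errors are mostly systematic rather than random.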

The price reduced-form model described in table 9 was estimated for
M = $M_Q$, M1, and the Divisia aggregates M1 and L using data through
1978-Q4 and then dynamically simulated to obtain inflation rate and
price-level forecasts for the period 1979-Q1 to 1983-Q4. The forecast
errors from this simulation are presented in table 10 and summary
statistics are given in table 11. Except for the two large positive errors
in 1982-Q4 and 1983-Q1, the inflation rate forecasts generated by the
$M_Q$ equation are quite good. In general, the $M_Q$ equation tends to
underpredict slightly the price level, but by a much smaller margin than
does the M1 equation. Note that Divisia M1 tracks the price level somewhat
better than the other aggregates until 1982. Compared with the narrow
aggregates, the more inclusive Divisia L does not do very well in
explaining the inflation rate, either within sample (as can be seen from
table 9) or out of sample, where it substantially underpredicts inflation.


¹⁸ Consistent with what has been reported elsewhere (e.g., Barnett,
Offenbacher, and Spindt 1984), Divisia M1 performs about the same as
conventional M1 in this exercise. In contrast to the equations using the
narrow measures of money, the equation using Divisia L underpredicts GNP
on average over this period, suggesting that the velocity of Divisia L was
unexpectedly strong. In fact, the character of the velocity of Divisia L
changes sharply in 1978, coincident with the deregulation of time
deposits, as can be seen from table 4.

Note: Estimation period 1972-Q2 to 1978-Q4. In this equation, P is the
implicit GNP deflator, PE is the relative price of energy, PF is the
relative price of fuel, and D1 and D2 are dummy variables that are "on"
from 1972-Q2 to

Inflation Rate Forecast Errors, Summary Statistics:
Price Reduced-Form Model

Note: Model described in table 9; errors in table 10. Prediction period
1979-Q1 to 1983-Q4.


Money-demand equations were estimated for $M_Q$, M1, and the Divisia M1
and L aggregates. The form of these equations was essentially that of the
demand for demand deposits function of the Federal Reserve Board Staff's
Monthly Money Market Model (Farr 1981) with two exceptions. First, no
constraint was placed on the interest rate elasticity in the estimation
procedure, as is usually done in the monthly model. Second, the suggestion
of Porter and Offenbacher (1982) to include explicitly a measure of the
brokerage fee as a determinant of money demand is adopted.¹⁹ The estimated
equations are presented in table 12. Postsample forecasts were generated
dynamically for 1979-83 and forecast errors are tabulated in table 13.
Summary statistics for these errors appear in table 14.

The tracking characteristics of the equation for $M_Q$ are quite good,
indicating that the recent behavior of $M_Q$ has been remarkably
consistent with its historical pattern. There is some tendency to
underpredict exhibited in 1981, particularly in the first quarter, but the
errors are still small and the projection returns to the actual track in
1982. These results indicate that $V_Q$ was slightly stronger in 1981 than
is consistent with the historical relationship. While the equations for
the narrow aggregates M1 and Divisia M1 tend to underpredict through this
period, the Divisia L equation (which has the best within-sample fit)
overpredicts in each year except 1982.

It should be stated explicitly that the purpose of examining these
relationships is simply to determine whether the experimental Fisher money
stock aggregate $M_Q$ behaves in the way that the substantive predictions
of monetary theory say a narrow (i.e., transactions) money aggregate
should behave with respect to income, prices, and interest rates. The
answer is generally in the affirmative. Since M1 is the standard narrow
aggregate and its behavior is well known, the comparison of $M_Q$ with M1
is helpful in making this assessment.²⁰ Extra attention is devoted to this
comparison during certain episodes precisely because it is known in
advance that during these episodes the behavior of M1 was anomalous
relative to its historical pattern, and we wish to know whether $M_Q$
similarly misbehaved.²¹ The empirical evidence suggests that it did not.

¹⁹ This suggestion was also made by Enzler et al. (1976).

TABLE 13

Money Stock Forecast Errors: Monthly Money-Demand Equation

TABLE 14

Money Stock Forecast Errors, Summary Statistics: Monthly Money-Demand
Equation (Prediction Period: January 1979-December 1983)

Summary Statistic

Root mean squared percentage error
Mean absolute percentage error
Mean percentage error
Regression coefficient
Theil's inequality coefficient
Fraction of error due to:
  Bias
  Different variation
  Different covariation

IV. Concluding Remarks 

This paper has presented a method for constructing a measure of the 
money stock that incorporates information on how monetary assets 
are used to mediate exchange as this is revealed in the turnover statis- 

20 Though less well known, the Divtsia aggregates have been extensively compared 
relative to conventional aggregates—see, c.g., Barnett et al. (1984 )—and are of sepa¬ 
rate interest as monetary index numhers. For these reasons they are included in this 
paper. In view of the compelling theoretical foundations of the Divisia aggregates, the 
apparently poor empiric al performance of Divisia L in the econometric relations con¬ 
sidered here may seem strange. But this is not necessarily so. The view of money 
underlying these relations is as a narrowly defined stock of transactions-oriented bal¬ 
ances They should not necessarily be expected to hold lor the capital-theoretic service 
flow conc ept of money measured by the Divisia aggregates. 

21 Note that we are not committed here to any specific hypothesis regarding the
source of the anomalies in M1 behavior: whether, e.g., there was some sort of
technological "shift" in money demand or whether M1 misbehaved because of the
binding effects of various regulatory controls, such as Regulation Q ceilings. The
point is that insofar as any of these effects gives rise to changes in the payments
mechanism or induces differential changes in the use of different means-of-payment
goods, Mq may be expected to be better behaved under these circumstances than a
conventionally defined aggregate.
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tics. It has been found that the procedures discussed here produce a 
very reasonable-looking monetary aggregate, and one which corre¬ 
sponds closely with the transactions aggregate contemplated in the 
equation of exchange MV = PQ. 

It is not the point of this paper to advocate the replacement of the 
conventional aggregates by the experimental measure Mq. Rather, 
the purpose has been to develop a measure that may be helpful in 
explaining certain anomalies in the behavior of the conventional 
aggregates. Indeed, it has been shown that during “normal” periods,
the conventional narrow monetary aggregate M1 closely approximates
the behavior of the M in MV = PQ, at least as this is measured
by Mq. At the same time, the results presented here suggest that it 
would be a sensible strategy to develop Mq further and to monitor it 
on an ongoing basis. 
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The Macroeconomics of a Transfer Program 
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Food stamps represent nearly $ 11 billion of personal income in the 
United States. The coupons that are issued to provide purchasing 
power to recipients are also reserves for the commercial banking 
system. This study asks at what rate these coupons are substitutable 
for what is usually considered money. The results, based on estimates
for 1959-81, suggest that food stamp coupons are substitutable
for M1 on a one-for-one basis, and a revised money supply
series including “Food Stamp Money” is included. Together with
other estimates showing that the marginal propensity to consume
out of food stamps is higher than that out of ordinary income, the
results suggest that the food stamp program is an automatic fiscal
and monetary stabilizer: under its provisions, both the money stock
and disposable income are increased during a recession.


I. Introduction 

The food stamp program in the United States has grown into one of 
the largest noncategorical income maintenance programs run by the 
federal government. In 1982 nearly $11 billion worth of stamps were 
paid out to households containing 22 million members. Food stamps 
have become the negative income tax that was never enacted. They 
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are generally available, offer a minimum guarantee, and are reduced 
by some fraction (now .82) for each dollar of additional countable 
income the household obtains from other sources. 

Unique among income maintenance programs, food stamp benefits 
are not paid as checks, cash, or reimbursements to vendors, but rather 
as specially printed stamps that eligible recipients obtain at certified 
disbursement outlets near their homes. (Until 1979 recipients were 
required to exchange cash for food stamps with a larger face value— 
the so-called purchase requirement.) These stamps in turn are used 
to purchase qualifying commodities. Thus, food stamps serve two
economic functions. They provide extra income to (some) consumers
and therefore serve as an automatic fiscal stabilizer; they also function
like money, in that they serve as a medium of exchange for (at the
very least) food transactions. In this study we examine the “moneyness”
of this unusual program by measuring the marginal rate of
substitution between food stamps and narrowly defined money, M1.

This analysis is motivated by two observations. First, food stamps 
serve as a medium of exchange and are, in fact, used as a substitute 
for currency or demand deposits in many transactions. 1 Second, the 
food stamp program, and in particular “food stamp money” (which is 
different from food stamps issued, as explained in the next section) 
became substantial at about the time that money-demand equations 
(Goldfeld 1976) began to underpredict the money stock. 2

The paper is organized as follows. Section II describes the deriva¬ 
tion of food stamp money and the estimation procedure utilized in 
this study. Section III presents empirical results on the moneyness of 
food stamp money, and Section IV summarizes the results and dis¬ 
cusses the full macroeconomic impact of the food stamp program in 
light of the results. 


1 There are frequent reports in the popular press of this use. For example, one
official of the Department of Agriculture (which administers the program at the
federal level) stated (Time, August 23, 1982), "The [food stamp] coupons are a second
currency. Anything you can buy with money, from electronics to houses to sex, you can
buy with Food Stamps." The article continues with reports that federal agents have
used coupons to buy boats, cars, a gun with a silencer, marijuana, and even a $35,000
house.

2 There is another interesting aside to the food stamp program. Because of the
accounting among the Department of Agriculture, the Treasury, the Federal Reserve,
and banks, food stamps raise the monetary base until they clear the broadly defined
banking system and are financed by the Treasury. Briefly, this is because food stamps
deposited by banks at the Federal Reserve are credited to the banks' reserve accounts.
This base float is eliminated by the ultimate financing. This quasi float is not peculiar to
the food stamp program, however, and occurs whenever the government writes a check
on itself. The difference is that food stamp coupons in circulation serve as a medium of
exchange.
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II. Data and Estimation 

The quarterly food stamp money data used in this study were con¬ 
structed from monthly data on food stamp issuances (FSI) and food 
stamp redemptions (FSR). Monthly food stamp redemptions were 
constructed by linearly interpolating annual data on food stamp de¬ 
structions. 3 Given these data on issuances and redemptions, food 
stamp money (FSM) in month t is defined as 

$$FSM_t = S_t + .5\,FSR_t, \tag{1}$$

where $S_t$ is the dollar amount of food stamps outstanding at the end of
month $t$ and $FSR_t$ is the average amount of food stamps redeemed
(used up) in $t$ (so that we can assume $.5\,FSR_t$ is outstanding on average
during $t$). $S_t$ is defined as

$$S_t = S_{t-1} + FSL_t, \tag{2}$$

where $FSL_t$ is the dollar amount of new issuances in $t$ less the amount
of those new issuances used up that period, that is,

$$FSL_t = FSI_t - FSR_t. \tag{3}$$

The series on FSM for 1959-81 is shown in table 1 along with M1 and 
the gross dollar value of FSI for the fourth quarters of 1959-81. 4 
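As a rough illustration of equations (1)-(3), the following Python sketch builds a monthly FSM series from issuance and redemption figures. The function name `food_stamp_money`, the starting stock `s0`, and the sample numbers are hypothetical and are not taken from the study's data. The step from monthly values to quarterly figures is omitted because the text does not spell it out.

```python
def food_stamp_money(fsi, fsr, s0=0.0):
    """Return the monthly FSM series implied by equations (1)-(3).

    fsi: monthly food stamp issuances (FSI)
    fsr: monthly food stamp redemptions (FSR)
    s0:  stock of stamps outstanding before the first month (assumed)
    """
    fsm = []
    stock = s0
    for issued, redeemed in zip(fsi, fsr):
        net_issuance = issued - redeemed      # equation (3): FSL_t
        stock = stock + net_issuance          # equation (2): S_t
        fsm.append(stock + 0.5 * redeemed)    # equation (1): FSM_t
    return fsm


# Hypothetical three-month example (arbitrary units).
example_fsi = [10.0, 12.0, 11.0]
example_fsr = [8.0, 11.0, 12.0]
example_fsm = food_stamp_money(example_fsi, example_fsr)
print(example_fsm)  # [6.0, 8.5, 8.0]
```

In the last month redemptions exceed issuances, so the end-of-month stock falls, while the half-month of redemptions is still counted as outstanding on average.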

To examine the role of food stamp money as a medium of ex¬ 
change, one would ideally follow a two-step procedure. The first step 
would set up a complete set of asset-demand equations and then 
estimate the elasticity of substitution between food stamp money and
the components of M1 (assuming M1 is the appropriate medium of
exchange) to determine if food stamp money and M1 are indeed
substitutes. This is analogous to Chetty’s (1969) procedure. The
second step, assuming they are substitutes, would be to estimate the
marginal rate of substitution between food stamp money and M1 to
determine if it would be appropriate simply to add food stamps to M1
to construct a true medium-of-exchange monetary aggregate.
Unfortunately, the first step requires data on the relative price of (return to
holding) food stamp money and M1. These data do not exist.
Consequently, we assume faute de mieux that food stamp money is a
substitute for M1 (as the comments in n. 1 indicate) and concentrate
attention on estimating the marginal rate of substitution between food
stamp money and M1, which we refer to as the “moneyness” of food

3 Monthly issuances of food stamps over the period covered in this study were
provided to us by the U.S. Department of Agriculture. Annual destruction data can be
found in the Annual Reports of the Board of Governors of the Federal Reserve System;
the entire series on FSM is available from us on request.
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TABLE I 

Food Stamp Money and M1, 1959-81 (in Billions of $)

Fourth Quarter      FSM      FSI (Annualized)      M1
1959                0        0                     141.2
1960                0        0                     142.0
1961                .01      .03                   146.0
1962                .02      .04                   148.7
1963                .02      .07                   154.6
1964                .02      .07                   161.4
1965                .01      .15                   168.5
1966                .02      .26                   173.2
1967                .03      .44                   184.3
1968                .04      .57                   197.8
1969                .05      .76                   205.5
1970                .20      2.28                  215.5
1971                .36      3.26                  229.9
1972                .45      3.84                  249.5
1973                .41      4.21                  263.9
1974                .52      6.84                  276.4
1975                1.24     8.53                  290.2
1976                2.00     8.36                  308.1
1977                2.81     8.03                  333.3
1978                3.81     8.21                  360.8
1979                3.15     7.76                  387.5
1980                3.06     9.02                  415.8
1981                4.12     10.40                 425.3


stamps. Given Chetty’s (1969) evidence on the very large substitution
elasticity between money and time deposits and money and other
assets, the forced assumption that the elasticity of substitution
between FSM and M1 is infinite seems minor.

To measure the moneyness of food stamps, consider a general 
short-run adjustment equation describing the demand for money: 

$$\ln M = \lambda \ln M_{-1} + \gamma X + \epsilon, \tag{4}$$

where $M$ is a measure of the stock of money, $X$ is a vector of variables,
$\lambda$ and $\gamma$ are parameters to be estimated, $\epsilon$ is an error term, and
the subscript denotes a lag. Without discussing the specific form
of the money-demand equation (the measure of the stock of money
or the vector of variables included in $X$), we can rewrite (4) to include
food stamp money as

$$\ln(M + a\,FSM) = \lambda \ln(M_{-1} + a\,FSM_{-1}) + \gamma X + \epsilon, \tag{4'}$$

where $a$ is a measure of the moneyness of food stamps, $1 \ge a \ge 0$.
The estimate of a indicates the rate at which holders of money substi- 
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tute the outstanding stock of food stamps for what is ordinarily 
defined as money. If a = 1, FSM is performing the same functions, in
terms of households’ and businesses’ demand for money, as M.

Equation (4') is estimated using data covering 1959:II-1981:IV.
The disturbance term is specified as $\epsilon = \rho\,\epsilon_{-1} + v$, where $v$ is assumed
to be white noise. To derive the parameter estimates in (4') the
likelihood function describing the equation (presented in the App.) is
maximized by searching the grid of values of a over the closed
interval [0, 1].
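The grid search can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it omits the correction for autocorrelated errors, uses synthetic data in place of the actual series, and treats the function names, the 11-point grid, and all numbers as hypothetical. For each fixed value of a it runs least squares on the transformed dependent variable and adds the log Jacobian of the transformation, which is the usual way to make likelihoods comparable across values of a.

```python
import numpy as np


def concentrated_loglik(a, M, FSM, X):
    """Concentrated log-likelihood with a held fixed (no AR(1) correction)."""
    y = np.log(M + a * FSM)
    # Regress y_t on a constant, y_{t-1}, and X_t by ordinary least squares.
    Z = np.column_stack([np.ones(len(y) - 1), y[:-1], X[1:]])
    target = y[1:]
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    sse = float(resid @ resid)
    n = len(target)
    # Log Jacobian of the transformation from M_t to ln(M_t + a * FSM_t).
    log_jacobian = -np.sum(np.log(M[1:] + a * FSM[1:]))
    return (-0.5 * n * (np.log(2.0 * np.pi) + 1.0)
            - 0.5 * n * np.log(sse / n)
            + log_jacobian)


def grid_search(M, FSM, X, grid=None):
    """Evaluate the log-likelihood on a grid of a in [0, 1]; return the best a."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 11)
    values = np.array([concentrated_loglik(a, M, FSM, X) for a in grid])
    return float(grid[np.argmax(values)]), values


# Synthetic quarterly data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
T = 92
X = rng.normal(size=(T, 2))
FSM = np.linspace(0.0, 4.0, T)
M = 140.0 + 3.0 * np.arange(T) + rng.normal(scale=1.0, size=T)

best_a, logliks = grid_search(M, FSM, X)
print(best_a)
```

Because the data here are random, the value of a that is selected carries no economic meaning; the point is only the mechanics of fixing a, estimating the remaining parameters, and comparing likelihood values.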


III. Estimates of the Moneyness 
of Food Stamp Money 

The first step in estimating (4') is to specify some explicit functional
form for money demand. Unfortunately, there is no single
money-demand specification that enjoys a consensus among economists (see
Hafer and Hein 1979). Consequently, we present results for three
well-known specifications of the demand for money. These are the
Goldfeld (1976), Friedman (1978), and Hamburger (1977) specifications.

Formally, we estimate the following money-demand equations for
various values of a over the period 1959:I-1981:IV, and for two
subperiods, 1959:I-1974:I and 1974:II-1981:IV:


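$$\ln\left(\frac{M_t + a\,FSM_t}{P_t}\right) = b_0 + b_1 \ln\left(\frac{Y_t}{P_t}\right) + b_2 \ln RCP_t + b_3 \ln RTD_t + b_4 \ln\left(\frac{M_{t-1} + a\,FSM_{t-1}}{P_{t-1}}\right), \tag{5}$$

$$\ln\left(\frac{M_t + a\,FSM_t}{P_t}\right) = c_0 + c_1 \ln\left(\frac{W_t}{P_t}\right) + c_2 \ln RCP_t + c_3 \ln RTD_t + c_4 \ln\left(\frac{M_{t-1} + a\,FSM_{t-1}}{P_{t-1}}\right), \tag{6}$$

$$\ln\left(\frac{M_t + a\,FSM_t}{Y_t}\right) = d_0 + d_1 \ln DPR_t + d_2 \ln RGL_t + d_3 \ln RTD_t + d_4 \ln\left(\frac{M_{t-1} + a\,FSM_{t-1}}{Y_{t-1}}\right), \tag{7}$$
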


where M is shift-adjusted M1B, Y is nominal GNP, P is the GNP
deflator, RCP is the commercial paper rate, RTD is the rate on time 
deposits, W is net private-sector wealth, DPR is the dividend-price 
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TABLE 2 

Log-Likelihood Values for Alternative Money-Demand Specifications
(t-Statistics in Parentheses)

                                            Period
                     1959:II-1981:IV      1959:II-1974:I       1974:II-1981:IV
Specification        a = 0     a = 1      a = 0     a = 1      a = 0     a = 1
Goldfeld (5)         -164.68   -164.36    -44.17    -45.09     -71.52    -71.21
                          (.81)               (-1.36)               (.78)
  Using consol rate  -134.03   -131.44    -35.84    -36.19     -62.37    -62.16
                          (2.28)              (-.83)                (.66)
Friedman (6)         -165.03   -164.77    -17.96    -17.76     -71.90    -71.64
                          (.71)               (.64)                 (.72)
  Using consol rate  -135.86   -131.40    -9.58     -8.36      -63.33    -63.12
                          (2.99)              (1.56)                (.65)
Hamburger (7)        486.10    486.67     355.64    355.83     156.60    156.96
                          (1.07)              (.76)                 (.85)
  Using consol rate  495.83    495.98     353.25    353.51     164.86    165.16
                          (.55)               (.72)                 (.76)


ratio, and RGL is the rate on long-term government bonds. 5 Equation
(5) is the Goldfeld specification, (6) is Friedman’s, and (7) is
Hamburger’s. In addition to results based on (5)-(7), estimates using
equations like (5)-(7), but with a measure of the yield on consols
substituted for other rates (see Amsler 1984), are also presented.

The equations were estimated for subperiods for two reasons. First,
food stamp money is relatively unimportant before 1974 (see table 1).
Second, it is well known that conventional money-demand functions
such as (5)-(7) exhibit some instability after 1973. Hafer and Hein
(1982, p. 11) have argued that the apparent instability is due to a
once-and-for-all level shift in the intercept of the money-demand
function around 1974:II. To account for this shift we have included a
dummy variable for the period 1974:II-1981:IV in the regression for
the whole sample period. As is standard in the money-demand
literature, each equation was estimated using the Cochrane-Orcutt
procedure. The estimates of the b_i, c_i, and d_i are close to those that have
appeared elsewhere.
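For readers unfamiliar with the estimation step, a minimal Python sketch of an iterative Cochrane-Orcutt regression with an intercept-shift dummy is given below. It is an illustration under stated assumptions rather than the code used for the reported estimates: the data are synthetic, the single regressor `x` stands in for the interest rate and income terms, and the function name, tolerance, and coefficient values are all hypothetical.

```python
import numpy as np


def cochrane_orcutt(y, X, tol=1e-8, max_iter=200):
    """Estimate y = X b + e with e_t = rho * e_{t-1} + v_t by Cochrane-Orcutt."""
    rho = 0.0
    for _ in range(max_iter):
        # Quasi-difference the data with the current rho (first observation dropped).
        y_star = y[1:] - rho * y[:-1]
        X_star = X[1:] - rho * X[:-1]
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        # Re-estimate rho from the residuals of the untransformed equation.
        resid = y - X @ beta
        new_rho = float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return beta, rho


# Synthetic quarterly sample; index 0 plays the role of 1959:I.
rng = np.random.default_rng(0)
T = 92
quarters = np.arange(T)
shift = (quarters >= 61).astype(float)  # dummy equal to 1 from 1974:II onward
x = rng.normal(size=T)

errors = np.zeros(T)
for t in range(1, T):
    errors[t] = 0.6 * errors[t - 1] + rng.normal(scale=0.1)

y = 1.0 + 0.5 * x - 0.3 * shift + errors
X = np.column_stack([np.ones(T), x, shift])

beta_hat, rho_hat = cochrane_orcutt(y, X)
print(beta_hat, rho_hat)
```

The coefficient on `shift` plays the role of the once-and-for-all level shift in the intercept, and `rho_hat` is the estimated first-order autocorrelation of the disturbances.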

Table 2 presents the values of the log-likelihood function for each
of the three money-demand specifications in the different
subperiods, for a = 0 and a = 1. (The likelihood function always
reached its maximum at one of these end points.) The numbers in


5 All the data except M1B and the food stamp data came from the FMP and Citibase
data banks. Shift-adjusted M1B was taken from Board of Governors of the Federal
Reserve System, “Revised Money Stock Data—March 1982.” 
TABLE 3

            Average Annual     Average Annual
            Missing Money      Food Stamp Money     Fraction
Year        ($ Billions)       ($ Billions)         (3) ÷ (2)
(1)         (2)                (3)                  (4)
1974*       3.58               .48                  .13
1975        5.33               1.00                 .19
1976        5.62               1.77                 .31
1977        6.12               2.59                 .42
1978        7.59               3.45                 .45
1979        8.30               3.38                 .41
Average     6.09               2.11                 .32

* Last 3 quarters.


parentheses are t-statistics testing the null hypothesis that a = 0
against the alternative that a ≠ 0. Two results stand out. In all cases
except two (the Goldfeld equations for the first subperiod) the
log-likelihood function is larger for a = 1 than for a = 0. Also, the
t-statistics indicate that in some cases one can reject the hypothesis that
a = 0, albeit at fairly low confidence levels. Based on this evidence, it
seems reasonable to conclude that the marginal rate of substitution of
food stamp money for M1 is unity.

We have shown that food stamp money acts like M1 but is not
included in any current definitions of money and that food stamp
money begins to grow rapidly around 1974 (see table 1). Perhaps,
then, food stamp money is the “missing money” economists have been
searching for since Goldfeld (1976). 6 The fraction of “missing money”
that might be accounted for by food stamp money is presented in
table 3. Column 2 of table 3 displays the average annual amount of
(nominal) missing money, defined (as is frequently done) as the static
forecast error of the Goldfeld money-demand equation from 1974 to
1979. 7 Column 3 displays the average annual amount of nominal food
stamp money. The last column is the ratio of food stamp money to
missing money. While food stamp money does not account for all the
missing money, it clearly accounts for a sizable part of it and therefore
contributes something to the resolution of an important empirical
puzzle.
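The arithmetic behind table 3 can be checked with a few lines of Python. The figures are those reported in the table; the variable names are ours. The sketch suggests that the "Average" entry in the last column is the mean of the six annual fractions (about .32) rather than the ratio of the two column averages (about .35).

```python
# Values from table 3, in billions of dollars.
missing_money = {1974: 3.58, 1975: 5.33, 1976: 5.62,
                 1977: 6.12, 1978: 7.59, 1979: 8.30}
food_stamp_money = {1974: 0.48, 1975: 1.00, 1976: 1.77,
                    1977: 2.59, 1978: 3.45, 1979: 3.38}

# Column 4: food stamp money as a fraction of missing money, year by year.
fraction = {year: food_stamp_money[year] / missing_money[year]
            for year in missing_money}

avg_missing = sum(missing_money.values()) / len(missing_money)
avg_fsm = sum(food_stamp_money.values()) / len(food_stamp_money)
avg_fraction = sum(fraction.values()) / len(fraction)

for year in sorted(fraction):
    print(year, round(fraction[year], 2))
print("Average", round(avg_missing, 2), round(avg_fsm, 2), round(avg_fraction, 2))
```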


6 The term derives from conventional money-demand functions' consistent
overestimation of the amount of money (M1) in circulation since 1974. This overestimate has
been labeled the "missing money."

7 The Goldfeld equation used to generate these forecasts is (5), estimated over the
period 1959:II-1974:I.
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IV. Conclusion and Further Comments on the 
Macroeconomics of Food Stamps 

The results presented in this paper indicate that food stamp money 
acts like M1 and therefore should be included in definitions of
money. It should be noted, however, that food stamp money is not the 
same as food stamps issued. All stamps issued are not redeemed 
immediately; consequently, the stock of outstanding stamps must be 
carried forward in calculating food stamp money. One important 
implication of the moneyness of food stamps is that, when the amount 
of food stamps issued rises in a recession, the true money stock rises 
more rapidly than that published by the Federal Reserve. Thus food 
stamps are an automatic money stabilizer. 

There is another important dimension to the food stamp program. 
Food stamps have been shown to add little to the amount of food 
consumed by recipients (Clarkson 1976; MacDonald 1977) or to the 
nutritional value of the food being purchased (Whitfield 1982). That 
being the case, the income freed by the recipients of food stamps must 
be either spent on other goods or saved. Elsewhere (Hamermesh and 
Johannes 1983) we have shown that the MPC out of food stamp 
income is at least as great as that out of ordinary income. 8 This sug¬ 
gests that food stamps enable recipients, many of whom have tem¬ 
porarily low incomes, to maintain consumption nearer to their 
lifetime optimizing consumption paths. This means that, because 
food stamp payments increase during a recession, the high propensity 
to spend them enables the food stamp program to function as an 
effective automatic fiscal stabilizer of aggregate demand. 

In summary, food stamps represent a large transfer payment that 
varies cyclically and that inherently changes aggregate demand 
through both the goods market and the money market. In a
macroeconomic context the food stamp program is both fiscal and
monetary policy. 9

Appendix 

The concentrated log-likelihood function for (4') is:

$$L_a = -\frac{N}{2}\,[\ln(2\pi) + 1] - \frac{N}{2}\ln\left(\frac{SSE}{N}\right) + \sum_t \ln J_t,$$

8 Hamermesh (1982) demonstrates this same effect for payment for unemployment
insurance.

9 Blinder and Solow (1974, p. 4) state, "[A] transaction is pure fiscal policy if it is
financed entirely with taxes, so that the public debt does not change, or if the
debt-financed part of the expenditure does not alter the proportions of outstanding
government obligations (including high-powered money)." By these criteria food stamps
clearly are a mixed policy.
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where SSE is the sum of squared errors from least-squares estimation of (4') 
after the correction for autocorrelation; a has been fixed at a particular value; 
N is the number of observations; and

$$J_t = \frac{1}{(M_t + a\,FSM_t)^{*}},$$

where * denotes the adjustment for autocorrelation.

lny, = -N ln(l - A) - N in(M + aFSM); 
the last term is just the mean of the dependent variable in (4‘). 
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Exchange Rate and Trade Instability: Causes, Consequences, and Remedies. Edited 

by David Bigman and Teizo Taya. 

Cambridge, Mass.: Ballinger Publishing Co., 1983. Pp. 340. $35.00. 

As the title implies, this book is concerned with different manifestations of 
exchange rate and trade instability, their causes and effects, and some possi¬ 
ble remedies. It is not intended as a unified or comprehensive appraisal of a 
decade of experience with managed floating. Rather, it consists of a collection 
of 13 essays by specialists in the field, each one focusing either on some set of 
empirical characteristics of floating rates (Frenkel, Bigman and Lee, and 
Dooley and Shafer), on the theory of exchange rate determination (Bigman, 
McKinnon, and McNelis and Condon), on some aspect of trade instability 
(Knudsen and Harbert, Kawai, and Chu et al.), or on individual-country
experience in managing or living with floating rates (Taya, von Furstenberg,
Michaely, and Otani). As is often the case with such collections, the quality of
the individual chapters varies considerably and the treatment of topics is quite
uneven (e.g., there is not much discussion of why exchange rate and trade
instability is costly or of how to define the equilibrium exchange rate). Still,
there are enough good pieces to make the book a potential supplementary
source for some graduate courses in international economics.

Let me just concentrate here on what are perhaps the few papers of widest 
interest. 

Jacob Frenkel, in the opening chapter, provides a clear and well-argued 
commentary on the turbulence in foreign exchange markets over the 1973- 
81 period. He notes that during the 1970s the foreign exchange value of the 
dollar was highly volatile, that changes in exchange rates were by and large 
unpredictable, that exchange rate movements did not conform closely to 
movements in national price levels, and that a favorable U.S. interest differ¬ 
ential was associated with a falling U.S. dollar for most of the 1970s but with a 
rising dollar from mid-1979.

Frenkel goes on to argue: (i) that the volatility and unpredictability of 
exchange rates are best interpreted in terms of the by-now familiar "asset 
market theory" of exchange rates (where current rates reflect expected future 
events and where “news” drives most exchange rate changes, especially in 
turbulent times); (ii) that the failure of purchasing power parity (PPP) to hold 
should be expected in periods of mostly “real" shocks and, even more gener¬ 
ally, because exchange rates are flexible and forward looking while national 
price levels are sticky and backward looking; and (iii) that the exchange rate- 
interest rate puzzle for the dollar is best explained by the distinction between 
nominal and real interest rates, with high nominal rates during most of the 

[Journal of Political Economy, 1985, vol. 93, no. 1]

© 1985 by The University of Chicago. All rights reserved.



1970s spelling dollar depreciation and high real rates since mid-1979 associ¬ 
ated with appreciation of the dollar. 

Policy implications follow in line. Do not restore fixed parities; do not adopt 
a rigid PPP rule for exchange rate management; and reduce high and vari¬ 
able rates of monetary expansion if you want both to restore price stability 
and to reduce costly turbulence in exchange rates. 

If I have any regret about the Frenkel paper, it is that it does not pay more
attention to competing hypotheses. For example, it would have been nice to 
have Frenkel’s view on what role, if any, in past exchange rate volatility could 
be ascribed to structural changes associated with shifts in policy regimes (i.e., 
time-varying parameters), or to speculative “bubbles,” or to variable risk pre¬ 
mia, or to impediments to capital mobility, or to “overshooting” based on 
sticky price adjustment—rather than to “news” alone. Similarly, Frenkel is 
surely right in suggesting that nominal interest rates can be a poor indicator 
of monetary stance during periods when inflationary expectations are high,
and that the exchange rate can be a useful supplementary indicator in such
times. But it would likewise seem prudent to note that monetary aggregates
can be difficult to interpret during periods of structural change in financial
markets and that the exchange rate can sometimes send false signals. In any
case, some of the “other” causes, consequences, and remedies not addressed 
in the Frenkel paper are discussed in other chapters, with, for example, 
McNelis and Condon writing about time-varying parameters in the determi¬ 
nants of exchange rates, McKinnon on capital constraints and other impedi¬ 
ments to stabilizing speculation, and Taya on official exchange market inter¬ 
vention. 

A different perspective on the behavior of major-currency exchange rates 
over the 1973-81 period is offered in the paper by Michael Dooley and 
Jeffrey Shafer. Their purpose is to test two contrasting views of exchange rate 
determination—the "price dynamics” view, which stresses the role of per¬ 
ceived trends in traders’ expectations and can generate an exchange rate path 
only loosely related to fundamentals, and the “efficient markets” view, which 
denies the existence of patterns in exchange rates that can be exploited for 
profitable private position taking or for government intervention. The tests 
utilize daily data on (spot) dollar-exchange rates for nine major currencies. 
Autocorrelations, runs tests, and filter rules are employed to test whether the 
data are consistent with the efficient markets model (i.e., whether spot ex¬ 
change rates follow a martingale). In short, Dooley and Shafer find that ”... 
a simple model of exchange rate determination that assumes both speculative 
efficiency and risk neutrality is not supported by the data. . . . Exchange 
rates continue to behave in ways that provide substantial potential for profit" 
(pp. 46-47). 
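To make the autocorrelation part of such a test concrete, here is a small Python sketch. It is not Dooley and Shafer's procedure or data: the exchange rate series is a simulated random walk, and the function name, lags, and the approximate two-standard-error band are our own illustrative choices. Under the martingale hypothesis, day-to-day changes in the (log) spot rate should show autocorrelations close to zero.

```python
import numpy as np


def autocorrelation(x, lag):
    """Sample autocorrelation of x at a given positive lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(d[lag:] @ d[:-lag] / (d @ d))


# Hypothetical daily log spot rates generated as a random walk.
rng = np.random.default_rng(1)
log_rate = np.cumsum(rng.normal(scale=0.005, size=1000))
changes = np.diff(log_rate)

# Approximate 95 percent band for the autocorrelations of an uncorrelated series.
band = 1.96 / np.sqrt(len(changes))
for lag in (1, 2, 5):
    print(lag, round(autocorrelation(changes, lag), 3), "band +/-", round(band, 3))
```

Autocorrelations that fall persistently outside the band would point to predictable patterns of the kind the efficient markets view rules out, although, as the review goes on to note, a rejection does not by itself say why the model fails.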

Rejecting the efficient markets model is one thing. Explaining why it was 
rejected is quite another. Here, Dooley and Shafer trot out a long list of 
candidates (including credit risk, capital controls, limited speculative re¬ 
sources, risk aversion, central bank intervention, and transactions costs) but in 
the end can offer only the suspicion that it is differences in equilibrium rates 
of return across assets denominated in different currencies, rather than 
inefficiencies in processing information, that is the likely culprit. The reasons 
one cannot go much further have been nicely put forward by Levich (1984) in 
his recent survey of empirical studies of exchange rates: 

However, a convincing empirical test of efficiency in the foreign 

exchange market is made difficult because there is no general 
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agreement on models for equilibrium pricing or equilibrium rates 
of return which is comparable to that for equity markets. Simply 
put, it is difficult to test whether investors efficiently set the actual 
spot exchange rate equal to its equilibrium value, unless there is 
some agreement on what the equilibrium value is. Similarly, it is 
difficult to test whether risk bearing is efficiently compensated if
there is no agreement on the fundamental nature of foreign
exchange risk, an adequate measure of foreign exchange risk, and a
model that determines the equilibrium fair return for bearing
foreign exchange risk. [Pp. 52-53]

Finally, mention should be made of several of the individual-country
studies in the volume that provide information on the effectiveness of policies
designed to influence exchange rates. In this connection, Teizo Taya analyzes
whether daily exchange market intervention by the Bank of Japan from
October 1977 to December 1979 had any significant effect on the speed of
exchange rate movements (rather than on the equilibrium exchange rate
itself). His conclusion, in brief, is that such intervention did not significantly
reduce fluctuations in the yen/dollar exchange rate, except perhaps on
occasion and then for only very short periods (a week or so). In a similar vein,
Ichiro Otani studies the effect on the exchange rate for the yen of capital
control measures imposed by the Japanese authorities for the period 1978-81.
His verdict is that while these control measures were qualitatively
consistent with their objectives (i.e., measures intended to discourage inflows of
capital into Japan depreciated the yen, while those to discourage outflows
appreciated it), they tended to be quantitatively ineffective because the
induced transactions costs led to quite small changes in exchange rates. Last but
not least, Michael Michaely's interpretation of Israel's experience with a
floating exchange rate over the period 1977-80 is interesting because the major
part of Israel's money is indexed to the foreign exchange rate (so-called
Isradollars) and because Israel's domestic inflation rate was so high and
variable over that period. Perhaps his major conclusion is that with such indexed
money the deterministic role of money in containing an inflationary process is
mostly lost and that the monetary system is thus rendered more unstable.


International Monetary Fund 


Morris Goldstein 
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The Federal Lands Revisited. By Marion Clawson. 

Washington: Resources for the Future (distributed by Johns Hopkins Univer¬ 
sity Press), 1983. Pp. 302. $25.00 (cloth); $8.95 (paper). 


In 1983 federally owned resources supplied 16 percent of U.S. oil
production, 30 percent of gas production, and 13 percent of coal production.
Federal resources are expected to provide an even larger share of future U.S.
energy. The federal government owns, for example, about one-third of the 
coal reserves of the United States, almost all located in the West (Nelson 
1983). 

The federal government also owns a vast surface estate, equal to about 30 
percent of the land area of the United States. Federal lands supply around 20 
percent of U.S. timber harvests from an inventory that exceeds 50 percent of 
the total U.S. inventory of softwood timber. Livestock grazing occurs on more 
than 200 million acres of federal lands—an area twice the size of California. 
About 80 million acres of federal lands have been set aside since 1964 as 
specially protected wilderness areas. In short, the federal government is di¬ 
rectly responsible for managing a significant share of the natural resources of 
the United States. 

The management of these resources raises economic issues normally en¬ 
countered in the study of comparative economic systems. The discipline of 
the market is absent from federal production decisions, which are made by 
large bureaucracies that never face a requirement to make a profit, to say 
nothing of maximizing profits. Although the actual production of federal 
resources is generally carried out by private firms, these firms operate under 
leases or other contractual arrangements that dictate many of the details of 
production. For example, leases to produce oil, gas, or coal on federal lands 
contain “diligent development requirements” that dictate that production 
must commence within a specified period—in many cases 10 years—or the 
lease will be forfeited. 

For many years scholars gave little attention to federal resource manage¬ 
ment. A major exception was Marion Clawson, an economist and the director 
of the Bureau of Land Management in the Interior Department from 1948 to 
1953. Shortly after Clawson joined Resources for the Future in 1955 he
coauthored The Federal Lands (Clawson and Held 1957), a work judged in a
recent survey to be among five classics of the public land scholarly literature
(Fairfax 1982). Clawson has published numerous other books and articles,
making him the foremost U.S. authority on federal land management. His
latest contribution, The Federal Lands Revisited, is in some ways his boldest
effort. For years, federal retention of the federal lands was considered a
settled matter. Clawson now suggests, however, that the time has come to
“think the unthinkable” (p. 14), to explore the possibility of divesting the
federal lands to private ownership, of transferring them to state ownership,
or of adopting other radical land tenure alternatives.

Clawson's new book is part of a rapidly growing literature during the past 
decade on federal resource management. At first this literature focused on 
the characteristics of federal management itself. Indeed, Clawson was a major 
contributor here as well. In 1976, he characterized Forest Service management
of the national forests as “disastrous” (Clawson 1976b, p. 763) and
indicated that “a resource management record of this kind is unacceptable for
either privately or publicly owned natural resources” (Clawson 1976a, p. 99).
Along with other researchers, Clawson found that little consideration was 
being given to economic concerns in managing federal lands and minerals. As 
a result, federal management exhibited pervasive inefficiencies, including 
allocation of land to lower over higher-value uses, investments for which the 
costs substantially exceeded the benefits, investments made in one place 
where other places would offer higher returns, the conservation of nonrenewable
resources where their immediate production would be appropriate,
or, conversely, current production where reserving resources for the
future instead would be desirable. The Forest Service, for example, was sell¬ 
ing substantial amounts of timber for harvest although the costs to the gov¬ 
ernment of holding the timber sale exceeded the revenues received by the 
government. Compounding the problem, some of this timber was located in 
prime recreational areas of the nation where timber harvesting was especially 
damaging to the environment (Hyde 1981). 

The Forest Service operates at a large deficit, despite the availability to it of 
large supplies of timber and other natural resources for which it has had to 
make little or no investment. Most of the timber harvested from the national 
forests, for example, still comes from areas that have never been cut before. 
Yet in 1980 the appropriations for the national forests were $1.7 billion, while 
the revenues collected were $655 million. On 170 million acres of public 
grazing lands administered by the Bureau of Land Management, grazing 
management costs exceed revenues collected by a factor of five to 10. Then, 
after federal land and mineral revenues are collected, a large share is typically
turned over to states and counties. For example, 50 percent of federal revenues
from mineral leasing are turned over to the state in which the minerals
are located. Local counties receive 25 percent of Forest Service gross receipts
from timber sales, even in those cases where the sale costs to the federal 
government exceed the gross revenues collected. 

Reviewing the deficits and other failings of federal resource management,
Clawson now affirms his earlier assessment, stating that “anyone who has
been a member of the Federal land-managing bureaucracy, as I have, . . .
can agree with much of the criticism regarding inefficiency in the Federal
agencies. There are indeed many pressures that result in inefficiency, and few
rewards for efficiency” (p. 163). The strong critics of federal resource management
include not only economists, but also environmentalists, industry
representatives, and other user groups.

As a growing body of economic studies during the 1970s showed major 
management problems with federally owned resources, a search for remedies 
began. Not surprisingly, there is wide disagreement on the best answers. 
Indeed, a main purpose of Clawson's book is to lay out alternatives for consideration.
Clawson is too much the impartial social scientist to take a strong
advocacy position. However, a little reading between the lines does suggest 
some views. 

One alternative would be to retain federal resources in federal ownership,
but to improve significantly the quality of federal resource management. In
exploring this alternative, Clawson delves into the question, Why should the
federal government own so much land and so many minerals in a country
ostensibly committed to private enterprise and decentralization of government
responsibilities? The best answers Clawson can find are surprisingly
weak. Federal land ownership may best be explained as an accident of history.
But it is also argued that the federal government will take a longer-run view
of natural resource needs, conserving natural resources that private owners
might squander. As compared with regulatory mechanisms, public ownership 
may also provide an easier and more secure means to protect the federal 
lands against adverse impacts that are external to the market system. Finally, 
some federal lands may sustain social values that would be incompatible with 
market provision. Clawson suggests that some of the current proponents of 
wilderness areas really regard them as religious symbols; market provision 
would profane these “cathedrals” of our day.

Assuming, as seems the most likely prospect, that federal land and mineral
resources will be retained by the federal government, there are a number of 
ways to improve their management. Clawson states that, as a first step, it 
would be necessary to apply a “standard of economic management” (p. 179) 
under which federal actions would be subject to much closer scrutiny to 
determine whether the benefits actually exceed the costs. Improved economic 
planning of federal resource use would generally be a key step in introducing 
greater concern for efficiency in management calculations. Clawson has long 
argued that federal agencies should develop a capital account, if they are 
properly to measure rates of return on alternative investments. Finally, he 
also proposes that the federal resource agencies should impose higher user 
fees and charges from which they should be allowed to fund their own opera¬ 
tions and investments. The overall effect of these management changes 
would be to put federal resource management on a much more businesslike 
basis. 

Nevertheless, reflecting his doubts about the prospects for actually achieving
major improvements in federal management, Clawson examines closely
the case for sale of federal lands to private owners. He notes that simply
giving away many federal lands would benefit national taxpayers because
these lands cost the government more to manage than they yield in revenues.
Private owners would probably be better managers, partly because they would
be more attentive to costs. In general, private ownership would offer the
information efficiencies and entrepreneurial incentives of the market. Thus,
in principle, Clawson favors greater private resource ownership, stating that
“I, at least, . . . accept much of the argument of the private enterprisers” (p.
103). He doubts, however, that the inertia of many years of federal land
ownership is easily reversed. Strong interest groups benefit from the status 
quo of federal resource ownership, while the beneficiaries from major tenure 
or other institutional change are more diffuse and thus less politically influen¬ 
tial. Much as federal regulatory agencies have often been captured in the 
past, so federal resource management agencies also develop strong support¬ 
ing clienteles. 

An alternative to federal or private ownership would be to transfer the
federal lands to the states. Clawson notes that proposals for cession of federal 
lands to the states have recurred throughout American history, most recently 
as part of the agenda for the "Sagebrush Rebellion." Western resentment of 
Carter administration policies boiled over in 1979 and 1980, leading many 
westerners to demand radical institutional changes. However, the election of
the Reagan administration then dissipated much of the force of this 
movement. 

Clawson is ambivalent about transferring federal land and mineral resources
to the states. Historically, he notes that the states have had a poor
record as land managers, although recently there have been some significant 
improvements. Indeed, state resource management might combine the worst 
of both worlds—achieving neither the advantages of private ownership nor 
the management expertise of the federal government. Moreover, the states 
would incur new management costs, exceeding in many cases the amounts 
that they would gain in new revenues from taking over surface lands. Thus, 
the states might well decline to accept a proposed transfer to them of federal 
surface lands. 

Further tenure alternatives include the creation of public corporations to 
manage federal resources or the long-term leasing of federal lands. A public
corporation would seek to achieve greater insulation of land management
decisions from political interference, making it possible to give greater attention
to economic efficiency. Long-term leasing would grant more secure tenure
to ranchers and other users of federal resources, allowing for a greater
private user role in investment and other management decisions. For most
practical purposes, a 100-year lease is equivalent to the sale of the resource;
yet leasing might well prove to be politically more acceptable.

Indeed, Clawson is enthusiastic about long-term leasing. He develops a
detailed illustrative proposal, involving 100-year leases for prime timber
lands. Shorter leases of 50 years are proposed for high-quality grazing lands,
for wilderness areas and other special recreation lands, and for lands to be
selected by state and local governments. Clawson proposes a number of
specific terms of the leases, even going so far as to specify appropriate rental 
rates for each lease category. 

There are major economic and physical differences among federal lands
and minerals. A tenure form suitable for one circumstance might not be
suitable for another. Hence, it might be appropriate to retain some federal
lands, lease others, turn some areas over to states, and, finally, to sell off still
other federal lands. Clawson closes his book with a plea for a spirit of intellectual
entrepreneurship in these matters. As he says, “I think that researchers,
land managers, and the intellectual community generally should be encouraged
to propose social inventions for Federal land management. Most of
the proposals will prove somewhat impractical, but on the whole, they may
prove highly useful. Let us try” (p. 273).

The Federal Lands Revisited has all the hallmarks that have become familiar to
readers of Clawson’s past books and articles. He is fair minded and shows
much common sense, while shedding new light on many topics. His book
provides many useful details as well as other valuable information about
federal land management. The literature in the field is knowledgeably surveyed
and assessed. The current issues are related to a long history of federal
resource management in which these issues have often arisen before.

Readers should not expect to find a book filled with high-powered modern
quantitative theory. Clawson's works do not develop formal models or test
hypotheses, nor do they propose dazzling new social concepts or theories.
Clawson is much more the synthesizer. He writes in the vein of an older
tradition of institutional studies and political economy. If you have to pick
one book to explore the latest thinking on government policies for federal
land and mineral management, The Federal Lands Revisited is it. Altogether, it
is one more fine achievement in a career that began with Clawson’s first paper
on federal resource management in 1936.

Robert H. Nelson 

Office of Policy Analysis, U.S. Department of the Interior


Debt, Deficits, and Finite Horizons


Olivier J. Blanchard

Many issues in macroeconomics, such as the level of the steady-state
interest rate or the dynamic effects of government deficits, depend
crucially on the horizon of agents. This paper develops a simple
analytical model in which such issues can be examined and in which
the horizon of agents is a parameter that can be chosen arbitrarily.
The first part characterizes the dynamics and steady state of the
economy in the absence of a government, focusing on the effects of
the horizon index on the economy. The paper clarifies the separate
roles of finite horizons and declining labor income through life in
the determination of steady-state interest rates. The second part
studies the effects and the role of fiscal policy. It clarifies the respective
roles of government spending, deficits, and debt in the determination
of interest rates.


This paper characterizes the dynamic behavior of an economy where 
agents have finite horizons. It then analyzes the effects and the role of 
government debt and deficits. There is in general no simple aggregate
consumption function in an economy composed of finitely lived
agents. This is because agents differ in two respects. Being of different
ages, they have different levels and compositions of wealth. Having
different horizons, they have different propensities to consume
out of wealth. This systematic relation among wealth level, wealth 
composition, and propensity to consume makes exact or approximate 
aggregation impossible (Modigliani 1966).

In view of this problem, the solution adopted by Diamond (1965)
was to choose a very simple population and age structure, avoiding
altogether the need for aggregation. The solution chosen in this paper
is, instead, to make assumptions that allow aggregation. The central
assumption is that agents face, throughout their life, a constant
instantaneous probability of death $p$. Thus their expected life is $(1/p)$;
furthermore, it is constant throughout their life. Agents are of different
ages and have different levels of wealth, but have the same horizon
and the same propensity to consume. This allows one to solve the
aggregation problem.

The main advantage of this approach is its flexibility. If we think of
$(1/p)$ as the horizon index, we can choose it anywhere between zero
and infinity and study the effects of the horizon of agents on the
behavior of the economy. In particular, by letting $p$ go to zero, we
obtain the infinite horizon case as a limiting case.

The main drawback of this approach is that it captures the finite
horizon aspect of life but not the change in behavior over life, the
“life-cycle” aspect of life. In that respect it is closer to the initial formulation
of permanent income by Friedman (1957) than to that of
life cycle by Modigliani (1966). It is well adapted to issues where the
finite horizon aspect is important, such as issues of debt and deficits. It
is poorly adapted to issues where differences in propensity to consume
across agents are potentially important.

Section I derives the behavior of both individual and aggregate
consumption. The aggregate consumption has a particularly simple
and tractable form: aggregate consumption is a linear function of
aggregate financial and human wealth. Human wealth is the present
discounted value of labor income accruing in the future to those
currently alive.

Sections II and III characterize the behavior of an economy of 
agents with finite horizons. Section II studies the dynamic behavior 
and steady states of both open and closed economies. Section III
considers two extensions. The first focuses on the effects of the elasticity
of substitution of consumption. The second allows for declining
labor income through life, to capture the effects of the “saving for
retirement” motive on capital accumulation. The effect of finite horizons,
per se, is to decrease capital accumulation. The effect of declining
labor income is, however, to increase it. The net effect is ambiguous,
ous, and the resulting steady state may well, as in Diamond, be 
inefficient. 

Section IV introduces the government. Because the focus is on 
intertemporal reallocations of taxes, the government is assumed to 
have lump-sum taxes at its disposal. The section introduces the gov¬ 
ernment budget constraint and shows how finite horizons imply a role
for debt policy. It then derives an “index of fiscal policy” that sum¬
marizes the effects of current and anticipated fiscal policy on aggre¬ 
gate demand. This index has two components. The first reflects the 
effects of government spending and shows how both the level and 
expected changes in spending affect aggregate demand. The second 
captures the effects of government finance and shows how both the 
level of debt and the expected sequence of deficits affect aggregate 
demand. The importance of this second component is smaller the 
longer the horizon of agents and disappears when agents have infinite 
horizons. Section V shows the steady-state effects of fiscal policy in the 
open and closed economies; Section VI characterizes its dynamic effects
by considering two examples. The first, inspired by the current
fiscal policy in the United States, is that of a reallocation that creates 
high deficits followed later by surpluses. The second, which leads to 
the study of optimal debt policy, studies the role of debt policy in 
smoothing aggregate consumption in the face of regular fluctuations 
in output. 


I. The Aggregate Consumption Function 

The derivation of aggregate consumption is based on two major assumptions.
The first specifies the probability of death and the structure
of population: Time is continuous. Each agent throughout his
life faces a constant probability of death $p$. 1 At any instant of time, a
large cohort, whose size is normalized to be $p$, is born. 2

If the probability of death is constant, the expected remaining life
for an agent of any age is given by $\int_0^\infty t\,p\,e^{-pt}\,dt = p^{-1}$. I shall refer to $p^{-1}$
as the horizon index. As $p$ goes to zero, $p^{-1}$ goes to infinity: agents
have infinite horizons.

The assumption that cohorts are large implies that, although each
agent is uncertain about the time of death, the size of a cohort declines
nonstochastically through time. A cohort born at time zero has
a size, as of time $t$, of $pe^{-pt}$, and the size of the population at any time $t$
is $\int_{-\infty}^{t} pe^{-p(t-s)}\,ds = 1$.
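
The two integrals above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the midpoint rule, and the truncation point `cutoff / p` are illustrative assumptions.

```python
import math

def expected_remaining_life(p, steps=100_000, cutoff=60.0):
    """Midpoint-rule value of the integral of t * p * exp(-p * t) over t >= 0."""
    upper = cutoff / p  # assumed truncation: exp(-p * t) is negligible beyond this
    dt = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += t * p * math.exp(-p * t) * dt
    return total

def population_size(p, steps=100_000, cutoff=60.0):
    """Midpoint-rule value of the integral of p * exp(-p * a) over ages a = t - s >= 0."""
    upper = cutoff / p
    da = upper / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        total += p * math.exp(-p * a) * da
    return total

if __name__ == "__main__":
    for p in (0.01, 0.02, 0.05):
        # The first number should be close to 1/p, the second close to 1.
        print(p, expected_remaining_life(p), population_size(p))
```

With $p = .02$, for instance, the horizon index $p^{-1}$ is 50 years, whatever the age of the agent.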

1 An alternative interpretation, suggested by Robert Barro, is to think not of agents
but of families. Then $p$ is the probability that either the family ends (i.e., that members
of the family die without children) or the current members of the family have no
bequest motive. The assumption of a constant $p$ is more acceptable under this interpretation.
How unrealistic is the assumption of a constant $p$? Evidence on mortality rates
suggests low and approximately constant probabilities from age 20 to age 40. After this,
mortality rates are well summarized by “Gompertz's Law” (see Wetterstrand 1981),
$g_y = 1 - e^{-c_y}$, $c_y = BC^y$, where $y$ is age, $g$ is the mortality rate, and $B$ and $C$ are positive
constants. Estimates are, e.g., $g_{50}$ = 1 percent; $g_{60}$ = 3 percent; $g_{80}$ = 16 percent; $g_{100}$
= 67 percent.

2 I assume for simplicity that there is no population growth. Introducing population
growth in the form of larger new cohorts over time is straightforward.

In the absence of insurance, uncertainty about death implies that
agents may leave unanticipated bequests although they have no be¬ 
quest motive. They may also be constrained to maintain a positive 
wealth position if they are prohibited from leaving debt to their heirs. 
Under my assumptions, private markets may, however, provide in¬ 
surance risklessly, and it is reasonable and convenient to assume that 
they do so. This motivates the second assumption: 3 There exist life 
insurance companies. Agents may contract to make (or receive) a 
payment contingent on their death. 

Because of the large number of identical agents, such contracts may 
be offered risklessly by life insurance companies. Given free entry 
and a zero profit condition, and given a probability of death $p$, agents
will pay (receive) a rate $p$ to receive (pay) one good contingent on their
death.

In the absence of a bequest motive, and if negative bequests are 
prohibited, agents will contract to have all of their wealth (positive or 
negative) return to the life insurance company contingent on their 
death. Thus, if their wealth is $w$, they will receive $pw$ if they do not die
and pay $w$ if they die.

These two assumptions are sufficient to characterize aggregate con¬ 
sumption. For simplicity, I shall, however, make two further assump¬ 
tions. The first, which implies a simple individual consumption func¬ 
tion, is that utility is logarithmic. The second, which implies a simple 
form for aggregate human wealth, is that labor income is distributed 
equally across agents. I shall relax these two assumptions in Section
III.


Individual Consumption


Denote by $c(s, t)$, $y(s, t)$, $w(s, t)$, $h(s, t)$ consumption, noninterest income,
nonhuman wealth, and human wealth of an agent born at time $s$, as of
time $t$. Let $r(t)$ be the interest rate at time $t$. Under the assumption that
instantaneous utility is logarithmic, the agent maximizes 4

$$E_t\left[\int_t^\infty \log c(s, v)\, e^{\theta(t - v)}\,dv\right], \qquad \theta \ge 0.$$


3 The role of insurance when there is uncertainty about time of death was studied by
Yaari (1965). An equivalent assumption is that there exist actuarial bonds. Lenders
lend to intermediaries. These claims are canceled by the death of the lenders. Borrowers
borrow from intermediaries; these claims are canceled by the death of the borrowers.
Intermediation can again be done risklessly.

4 The assumption of a constant probability of death implies that the objective function
(eq. [1]) does not change through time. There is therefore no issue of time inconsistency
of initial optimal programs.

Given the constant probability of death $p$, and if the only source of
uncertainty is about the time of death, maximizing the above is equiv¬ 
alent to maximizing 

$$\int_t^\infty \log c(s, v)\, e^{(\theta + p)(t - v)}\,dv. \tag{1}$$

The effective discount rate is therefore $(\theta + p)$. Even if $\theta$ is equal to
zero, agents will discount the future if $p$ is positive.
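
The equivalence between expected utility under an uncertain date of death and discounting at $\theta + p$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the consumption path `consumption(v)`, the truncation point `upper`, and the function name are hypothetical, and time is measured from $t = 0$.

```python
import math

def compare_objectives(theta, p, consumption, upper=400.0, steps=200_000):
    """Return (expected, discounted) for a given consumption path.

    expected   : expectation, over an exponentially distributed date of death T
                 with parameter p, of the integral of log c(v) e^{-theta v} on [0, T]
    discounted : integral of log c(v) e^{-(theta + p) v} on [0, upper]
    """
    dv = upper / steps
    utility_so_far = 0.0  # running integral of log c(v) e^{-theta v}
    expected = 0.0
    discounted = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        u = math.log(consumption(v))
        cell = u * math.exp(-theta * v) * dv
        utility_so_far += 0.5 * cell  # utility accumulated up to the cell midpoint
        expected += p * math.exp(-p * v) * utility_so_far * dv
        utility_so_far += 0.5 * cell  # finish the cell
        discounted += u * math.exp(-(theta + p) * v) * dv
    return expected, discounted

if __name__ == "__main__":
    # Hypothetical consumption path c(v) = 1 + v.
    print(compare_objectives(0.03, 0.05, lambda v: 1.0 + v))
```

The two returned numbers agree up to discretization error, including when `theta` is set to zero.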

If an agent has wealth $w(s, t)$ at time $t$, he receives $r(t)w(s, t)$ in
interest and $pw(s, t)$ from the insurance company. Thus his dynamic
budget constraint is

$$\frac{dw(s, t)}{dt} = [r(t) + p]\,w(s, t) + y(s, t) - c(s, t). \tag{2}$$

An additional transversality condition is needed to prevent agents
from going infinitely into debt and protecting themselves by buying
life insurance. I impose a condition that is the extension of that used
in the deterministic case. 5 The solution must be such that if the agent
is still alive at time $v$,

$$\lim_{v \to \infty} e^{-\int_t^v [r(\mu) + p]\,d\mu}\, w(s, v) = 0.$$

If this is the case, the budget constraint can be integrated to give

$$\int_t^\infty c(s, v)\, e^{-\int_t^v [r(\mu) + p]\,d\mu}\,dv = w(s, t) + h(s, t), \tag{3}$$

where

$$h(s, t) = \int_t^\infty y(s, v)\, e^{-\int_t^v [r(\mu) + p]\,d\mu}\,dv.$$

The agent maximizes (1) subject to (3). This problem is very similar to
the deterministic case, except for the presence of $(\theta + p)$ and $(r + p)$
instead of $\theta$ and $r$. As utility is logarithmic, the solution is simply

$$c(s, t) = (p + \theta)[w(s, t) + h(s, t)]. \tag{4}$$

Individual consumption depends on total individual wealth, with
propensity $(\theta + p)$. The discount rate used to discount labor income is
$(r + p)$, the same as the rate at which nonhuman wealth accumulates.
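
As an illustration of (2) and (4), the sketch below simulates a single surviving agent under the simplifying assumption, made here only for the example, that $r$ and $y$ are constant, so that human wealth is $h = y/(r + p)$. The function name, the Euler step, and the parameter values are hypothetical.

```python
import math

def simulate_agent(w0, y, r, p, theta, years=200.0, dt=0.01):
    """Euler simulation of budget constraint (2) with consumption rule (4).

    Assumes constant r and y (an illustrative simplification), so that
    human wealth is the constant h = y / (r + p).
    Returns terminal wealth and terminal wealth discounted at rate r + p.
    """
    h = y / (r + p)
    w = w0
    for _ in range(int(round(years / dt))):
        c = (p + theta) * (w + h)            # consumption rule, eq. (4)
        w += ((r + p) * w + y - c) * dt      # budget constraint, eq. (2)
    return w, w * math.exp(-(r + p) * years)

if __name__ == "__main__":
    w_end, discounted_w = simulate_agent(w0=1.0, y=1.0, r=0.04, p=0.02, theta=0.03)
    # Total wealth w + h grows at rate r - theta, while discounted wealth
    # shrinks toward zero, as the transversality condition requires.
    print(w_end, discounted_w)
```

Under these assumptions total wealth $w + h$ grows at rate $r - \theta$, which is smaller than the discount rate $r + p$ whenever $\theta + p > 0$, so the discounted value of wealth vanishes.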


5 This is the extension to infinite time of the condition proposed by Yaari.

Throughout the paper, when characterizing the behavior of consumption given
$r$ and $y$, I assume that, at least asymptotically, $r$ is larger than $-p$. This rules out
pathological cases.

Aggregate Consumption

Denote aggregate variables by uppercase letters. The relation between
any aggregate variable $X(t)$ and an individual counterpart $x(s, t)$
is

$$X(t) = \int_{-\infty}^{t} x(s, t)\, p\, e^{p(s - t)}\,ds.$$

Let $C(t)$, $Y(t)$, $W(t)$, $H(t)$ denote aggregate consumption, noninterest
income, nonhuman wealth, and human wealth at time $t$, respectively.
Then from equation (4), aggregate consumption is given by

$$C(t) = (p + \theta)[H(t) + W(t)].$$

Aggregate consumption is a linear function of aggregate human
and nonhuman wealth. The next step is to characterize the dynamics
of both components of aggregate wealth.

Human wealth is given by

$$H(t) = \int_{-\infty}^{t} h(s, t)\, p\, e^{p(s - t)}\,ds
= \int_{-\infty}^{t} \left\{ \int_t^\infty y(s, v)\, e^{-\int_t^v [r(\mu) + p]\,d\mu}\,dv \right\} p\, e^{p(s - t)}\,ds.$$

Changing the order of integration gives

$$H(t) = \int_t^\infty \left\{ \int_{-\infty}^{t} y(s, v)\, p\, e^{p(s - v)}\,ds \right\} e^{-\int_t^v r(\mu)\,d\mu}\,dv.$$

This has a simple interpretation. The term in braces is labor
income accruing at time $v$ to agents already alive at time $t$. Human
wealth is thus the present value of future labor income accruing to
those currently alive. To characterize the dynamic behavior of $H(t)$,
we need to specify the distribution of labor income across agents. We
shall assume for the moment that labor income is equally distributed
(i.e., that all agents work and have the same productivity): $y(s, v) = Y(v)$
for all $s$. Thus, all agents have the same human wealth and $H(t)$ is
given by

$$H(t) = \int_t^\infty Y(v)\, e^{-\int_t^v [r(\mu) + p]\,d\mu}\,dv$$

or, in differential equation form,

$$\frac{dH(t)}{dt} = [r(t) + p]\,H(t) - Y(t)$$

$$\lim_{v \to \infty} H(v)\, e^{-\int_t^v [r(\mu) + p]\,d\mu} = 0.$$

Nonhuman wealth is given by

$$W(t) = \int_{-\infty}^{t} w(s, t)\, p\, e^{p(s - t)}\,ds.$$

Differentiating with respect to time gives 

$$\frac{dW(t)}{dt} = w(t, t) - pW(t) + \int_{-\infty}^{t} \frac{dw(s, t)}{dt}\, p\, e^{p(s - t)}\,ds.$$

The first term on the right is the financial wealth of newly born 
agents, which is equal to zero. The second term is the wealth of those 
who die. The third is the change in the wealth of those alive. Using 
equation (2) gives 

$$\frac{dW(t)}{dt} = r(t)W(t) + Y(t) - C(t).$$

Whereas individual wealth accumulates, for those alive, at rate $r + p$,
aggregate wealth accumulates at rate $r$. This is because the amount
$pW$ is a transfer, through life insurance companies, from those who
die to those who remain alive; it is not therefore an addition to aggregate
wealth.

Collecting equations, dropping the time index, and denoting $d/dt$ by a
dot gives a first characterization of aggregate consumption:

$$C = (p + \theta)(H + W) \tag{5}$$

$$\dot{H} = (r + p)H - Y \tag{6}$$

$$\dot{W} = rW + Y - C. \tag{7}$$

If agents have finite horizons, that is, if $p > 0$, the discount rate on noninterest
income in (6) exceeds the interest rate. 6

There is an alternative characterization of the behavior of aggregate
consumption that will be useful. Differentiating (5) and eliminating
$\dot{H}$ and $\dot{W}$ gives

$$\dot{C} = (r - \theta)C - p(p + \theta)W; \tag{8}$$

$$\dot{W} = rW + Y - C. \tag{9}$$

If agents have infinite horizons, $p = 0$ and equation (8) reduces to the
standard equation (e.g., Hall 1978). If $p > 0$, the rate of change of $C$
depends also on nonhuman wealth. Note that even if $p$ is positive,
individual consumption follows $\dot{c} = (r - \theta)c$. Thus if $r = \theta$, individual
consumption will be constant but aggregate consumption will in general
vary.

6 Such a specification, allowing for a higher discount rate for human wealth, has been
estimated by Hayashi (1982). His estimated coefficients, $\alpha$, $\mu$, and $\rho$, are related to $p$, $\theta$,
and $r$ by $p = \mu - \rho$; $\theta = \alpha - \mu + \rho$; $r = \rho$. His estimates (table 1, $\lambda = 0$) imply at annual
rates $p = .10$; $r = .03$; $\theta = -.03$.
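
The step from (5)-(7) to (8) can be verified mechanically: differentiate (5), substitute (6) and (7), and use (5) once more to replace $H$. The sketch below checks the resulting identity at randomly drawn values; the function names and the sampling ranges are illustrative assumptions.

```python
import random

def c_dot_from_components(H, W, Y, r, p, theta):
    """Time derivative of C obtained by differentiating (5) and using (6) and (7)."""
    C = (p + theta) * (H + W)        # eq. (5)
    H_dot = (r + p) * H - Y          # eq. (6)
    W_dot = r * W + Y - C            # eq. (7)
    return (p + theta) * (H_dot + W_dot)

def c_dot_from_eq8(H, W, Y, r, p, theta):
    """Time derivative of C as stated in eq. (8)."""
    C = (p + theta) * (H + W)
    return (r - theta) * C - p * (p + theta) * W

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        args = (rng.uniform(1, 100), rng.uniform(-50, 100), rng.uniform(0.5, 5),
                rng.uniform(0.0, 0.1), rng.uniform(0.0, 0.1), rng.uniform(0.0, 0.1))
        print(c_dot_from_components(*args) - c_dot_from_eq8(*args))  # close to zero
```

The printed differences are zero up to floating-point rounding, whatever values are drawn.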


II. Dynamics and Steady State 

I consider in turn the cases of a closed and an open economy. 


The Open Economy 

In the open economy, the interest rate is the world interest rate, $r$,
which is given and at which consumers can freely borrow and lend.
For simplicity, there is no capital and the only assets are therefore the
net holdings of foreign assets, denoted $F$. Noninterest income is exogenous
and denoted $\omega$. Using equations (8) and (9), equilibrium is
characterized in this case by

$$\dot{C} = (r - \theta)C - p(p + \theta)F; \tag{10}$$

$$\dot{F} = rF + \omega - C. \tag{11}$$

Note that one can also think of this system as giving the partial equilibrium
dynamics of consumption, savings, and wealth, given $r$ and $\omega$,
in a closed economy.

This system is linear in $C$ and $F$. It is saddlepoint stable if $r$ is less
than $\theta + p$. If $r$ were larger than $\theta + p$, individual consumption would
increase at a rate larger than $p$, that is, larger than the rate of death;
aggregate consumption would therefore increase forever. If we exclude
this case, we have a well-defined steady state, with associated
values for $C$ and $F$. This result differs sharply from the infinite horizon
case where a steady-state value of $C$ exists only if $r = \theta$ and where
in this case the value of $F$ is indeterminate (or more precisely, depends
on the path of adjustment).

I now construct the phase diagram. The slope of $\dot{C} = 0$ is positive if
$r > \theta$, negative if $r < \theta$. The slope of $\dot{F} = 0$ is positive. Both cases are
represented in figures 1a and 1b. In both cases, the stable arm is
upward sloping.

What are the characteristics of this steady state? If $r = \theta$, the value
of $F$ is zero. Agents have flat labor income and consumption profiles;
they do not save or dissave. If $r$ is greater than $\theta$, individual consumption
is increasing, agents are accumulating over their life, and the
level of foreign assets is positive. If $r$ is smaller than $\theta$, agents are
decumulating and, as a result, the level of foreign assets is negative.
The country is a net debtor in steady state.

An increase in $r$ increases the level of foreign assets. An increase in
$p$ pushes the level of foreign assets toward zero, reducing it if positive
and increasing it if negative: shorter horizons imply smaller aggregate
accumulation or decumulation.

[Fig. 1, panels a and b: phase diagrams in $C$ and $F$; the locus $\dot{F} = 0$ is $C = \omega + rF$.]

An alternative way of looking at aggregate behavior is to return to
the specification giving aggregate consumption as a function of
wealth and to derive an aggregate savings function. $C = (p + \theta)[\omega/(r + p) + F]$,
whereas income is $\omega + rF$, so that savings are
$\omega + rF - C = [(r - \theta)/(r + p)]\omega + (r - p - \theta)F$. As $r$ is less than $\theta + p$, savings is a decreasing
function of wealth $F$. 7 The effect of noninterest income $\omega$ is ambiguous
and depends on $(r - \theta)$. In steady state, savings must be
equal to zero. If $r$ is equal to $\theta$, this implies a zero equilibrium level of
foreign assets. If $r$ is greater than $\theta$, the equilibrium level is positive; if
$r$ is less than $\theta$, the equilibrium level is negative.
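
The steady state of the open economy can be computed in closed form by setting (10) and (11) to zero. The expression for $F$ used below is derived here for illustration and is not stated in the text; the function names and parameter values are likewise hypothetical.

```python
def open_economy_steady_state(r, theta, p, omega):
    """Steady state (C, F) of eqs. (10)-(11), assuming -p < r < theta + p.

    Setting both time derivatives to zero gives
        F = omega * (r - theta) / ((r + p) * (theta + p - r)),  C = omega + r * F.
    """
    if not (-p < r < theta + p):
        raise ValueError("requires -p < r < theta + p")
    F = omega * (r - theta) / ((r + p) * (theta + p - r))
    C = omega + r * F
    return C, F

def savings(F, r, theta, p, omega):
    """Aggregate savings function: income omega + r F less consumption."""
    return (r - theta) / (r + p) * omega + (r - p - theta) * F

if __name__ == "__main__":
    for r in (0.02, 0.03, 0.04):          # below, equal to, and above theta
        C, F = open_economy_steady_state(r, theta=0.03, p=0.02, omega=1.0)
        print(r, C, F, savings(F, r, 0.03, 0.02, 1.0))
```

In this example $F$ is negative, zero, and positive as $r$ is below, equal to, or above $\theta$, and savings are zero at each steady state.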

7 This savings function is similar to that used by Dornbusch and Fischer (1980) in
their study of current account dynamics.

Fig. 2


The Closed Economy

In the closed economy, $\omega$ and $r$ are no longer given but are determined
instead by capital accumulation. There are two factors of production,
capital $K$ and labor; from above, the size of the population is
equal to the size of the labor force and equal to unity. Let $F(K, 1)$ be
the constant returns to scale production function and $\delta$ be the depreciation
rate. Define $F(K) \equiv F(K, 1) - \delta K$.

Nonhuman wealth is equal to $K$. Noninterest income is labor income
$\omega(K)$. The interest rate is the net marginal product of capital,
$r(K)$, which may be positive or negative. Using equations (8) and (9)
gives

$$\dot{C} = [r(K) - \theta]C - p(p + \theta)K \tag{12}$$

$$\dot{K} = F(K) - C. \tag{13}$$

Figure 2 characterizes the phase diagram associated with the system.
Let us define two values of $K$: $K^*$ such that $r(K^*) = \theta$, and $K^{**}$
such that $r(K^{**}) = \theta + p$. The locus $\dot{C} = 0$ is upward sloping, going
through the origin and asymptotically reaching $K^*$. The locus $\dot{K} = 0$
traces the production function.

The equilibrium is unique with a saddlepoint structure. The stable
arm SS is upward sloping. Any other trajectory can be shown to imply
a negative level of $C$ or $K$ in finite time, and thus the stable arm is the
only acceptable trajectory: given $K$, $C$ is uniquely determined.

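The steady-state interest rate $\bar{r}$ is between $\theta$ and $\theta + p$:
$\bar{r} \geq \theta$ follows from $\bar{K} < K^*$; $\bar{r} < \theta + p$ is shown by contradiction.
Suppose $\bar{r} \geq \theta + p$, so that $(\bar{r} - \theta)\bar{C} \geq p\bar{C}$.
From equation (12) and $\dot{C} = 0$, $(\bar{r} - \theta)\bar{C} = p(p + \theta)\bar{K}$,
so that $(p + \theta)\bar{K} \geq \bar{C}$.
From equation (13) and $\dot{K} = 0$, $\bar{C} = F(\bar{K})$, so that $(p + \theta)\bar{K} \geq F(\bar{K})$.
As by assumption $\bar{r} \geq \theta + p$, $\bar{r}\bar{K} \geq F(\bar{K})$,
which is impossible because labor income $F(\bar{K}) - \bar{r}\bar{K}$ is positive.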
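The bounds on the steady-state interest rate can be illustrated numerically. The sketch below is only an illustration under an assumed Cobb-Douglas technology, $F(K, 1) = K^a$; the technology, the parameter values, and all function names are assumptions that do not appear in the text. It solves $[r(K) - \theta]F(K) = p(p + \theta)K$, which combines (12) and (13) at rest, by bisection between $K^{**}$ and $K^*$.

```python
def net_interest_rate(K, a=0.3, delta=0.05):
    """Net marginal product of capital r(K) for the assumed F(K, 1) = K**a."""
    return a * K ** (a - 1.0) - delta


def net_output(K, a=0.3, delta=0.05):
    """Net production function F(K) = K**a - delta*K."""
    return K ** a - delta * K


def capital_at_rate(rate, a=0.3, delta=0.05):
    """Capital stock at which r(K) equals `rate`."""
    return (a / (rate + delta)) ** (1.0 / (1.0 - a))


def steady_state_capital(theta, p, a=0.3, delta=0.05, tol=1e-12):
    """Solve [r(K) - theta]*F(K) = p*(p + theta)*K by bisection."""
    def excess(K):
        return ((net_interest_rate(K, a, delta) - theta) * net_output(K, a, delta)
                - p * (p + theta) * K)

    lo = capital_at_rate(theta + p, a, delta)   # K**, where excess > 0
    hi = capital_at_rate(theta, a, delta)       # K*,  where excess <= 0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    theta = 0.03
    for p in (0.0, 0.02, 0.05):
        K_bar = steady_state_capital(theta, p)
        print(f"p={p:.2f}  K_bar={K_bar:.4f}  r_bar={net_interest_rate(K_bar):.5f}")
```

Under these assumptions the computed $\bar{r}$ equals $\theta$ when $p = 0$ and lies strictly between $\theta$ and $\theta + p$ when $p$ is positive.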

If $p = 0$, so that agents have infinite horizons, $\bar{r} = \theta$: the standard
“modified golden rule” result obtains. When $p$ is positive, however, $\bar{r}$ is
larger than $\theta$. The reason is clear if we return to the individual
consumption equation $\dot{c} = (r - \theta)c$. In a steady state, labor income is
constant through life. In order to generate positive aggregate capital,
agents must be saving initially and consumption must be increasing.
Thus $\bar{r}$ must be larger than $\theta$.

Furthermore, $\bar{r}$ is an increasing function of $p$. This follows from the
phase diagram: an increase in $p$ shifts $\dot{C} = 0$ to the left, decreasing $\bar{K}$.
The shorter the horizon, the higher the interest rate and the lower
the level of steady-state capital. I shall show, however, in the next
section that declining labor income during life has the opposite effect.

III. Two Extensions 

I consider now two extensions to the original model. The first focuses
on the effects of the elasticity of substitution of consumption, the
second on the effects of declining labor income through life.


The Role of the Elasticity of Substitution

I now consider the class of isoelastic utility functions:

$u(c) = \frac{c^{1 - \sigma}}{1 - \sigma}, \quad \sigma \neq 1,$

$u(c) = \log c, \quad \sigma = 1.$

The elasticity of substitution is given by $\sigma^{-1}$. All other assumptions
are unchanged.

Following the same steps as in Section I, the first-order condition to
the individual maximization problem is

$dc(s, t)/dt = \sigma^{-1}[r(t) - \theta]c(s, t).$ (14)

Solving for $c(s, v)$, $v \geq t$, as a function of $c(s, t)$, replacing in the
budget constraint given by equation (3), and solving for $c(s, t)$ gives

$c(s, t) = [\Delta(t)]^{-1}[w(s, t) + h(s, t)].$

As before, consumption is a linear function of total wealth. The
propensity to consume is now a function of the sequence of future
interest rates; it is, however, not a function of age and is therefore the
same for all agents.

Following the same steps as in Section I gives the dynamic behavior
of aggregate consumption:

$C = \Delta^{-1}(H + W)$

$\dot{\Delta} = -1 - \sigma^{-1}[(1 - \sigma)(r + p) - (\theta + p)]\Delta,$

or, equivalently,

$\dot{C} = \sigma^{-1}(r - \theta)C - p\Delta^{-1}W$

$\dot{\Delta} = -1 - \sigma^{-1}[(1 - \sigma)(r + p) - (\theta + p)]\Delta,$

where the behavior of $\Delta$ is characterized by a differential equation
(and a transversality condition I have not written) and the dynamic
behavior of $H$ and $W$ is the same as in Section I.

I shall limit my analysis to the effects of $\sigma$ on steady-state capital in a
closed economy. The steady state is characterized by

$\dot{C} = \sigma^{-1}[r(K) - \theta]C - p\Delta^{-1}K = 0$ (15)

$\dot{K} = F(K) - C = 0$ (16)

$\dot{\Delta} = -1 - \sigma^{-1}\{(1 - \sigma)[r(K) + p] - (\theta + p)\}\Delta = 0.$ (17)

Figure 3 characterizes the steady state graphically. The first locus is
$\dot{C} = \dot{\Delta} = 0$, or

$C = pK\,\frac{(\sigma - 1)[r(K) + p] + (\theta + p)}{r(K) - \theta}.$

For $\sigma > 1$, this locus starts at the origin, is upward sloping, and
approaches $K^*$ asymptotically, where $K^*$ is again such that $r(K^*) = \theta$.
For $\sigma < 1$, the locus is initially downward sloping, then upward sloping.
It also approaches $K^*$ asymptotically. These two cases are drawn
in figures 3a and 3b. The second locus, $\dot{K} = 0$, traces the net
production function.

The steady-state capital stock is smaller than $K^*$; the net marginal
product $r$ is thus greater than $\theta$. An increase in $\sigma$ shifts the $\dot{C} = 0$
locus to the left, decreasing steady-state capital. Thus, the lower the
elasticity of substitution, the lower steady-state capital. Intuition for
this result is obtained by examining equation (14), giving the behavior
of individual consumption. As in Section II, a positive aggregate
capital requires savings initially in life. As labor income is flat, this
requires initially low and increasing consumption. The lower $\sigma^{-1}$, the
larger the interest rate needed to twist the consumption path.
Equivalently, the lower $\sigma^{-1}$, the lower initial individual savings given the
interest rate.

Fig. 3 (panel 3a: $\sigma > 1$, low elasticity; panel 3b: $\sigma < 1$, high elasticity)
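The comparative static in $\sigma$ can be illustrated with a small numerical sketch. It assumes, purely for illustration, a Cobb-Douglas technology $F(K, 1) = K^a$, so that $F(K)/K = (r + \delta)/a - \delta$ when $r(K) = r$; the technology, parameter values, and function names are assumptions. Combining the $\dot{C} = \dot{\Delta} = 0$ locus with $C = F(K)$ gives one equation in the steady-state interest rate, solved here by bisection.

```python
def steady_state_rate(sigma, theta=0.03, p=0.02, a=0.3, delta=0.05, tol=1e-13):
    """Steady-state r from (15)-(17), assuming F(K, 1) = K**a."""
    def excess(r):
        output_per_capital = (r + delta) / a - delta   # F(K)/K when r(K) = r
        return ((r - theta) * output_per_capital
                - p * ((sigma - 1.0) * (r + p) + theta + p))

    lo, hi = theta, theta + 1.0        # excess(theta) < 0 for sigma > 0
    while excess(hi) <= 0.0:           # widen the bracket if needed
        hi += 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def capital_at_rate(r, a=0.3, delta=0.05):
    """Capital stock at which the net marginal product equals r."""
    return (a / (r + delta)) ** (1.0 / (1.0 - a))


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0, 4.0):
        r_bar = steady_state_rate(sigma)
        print(f"sigma={sigma:.1f}  r_bar={r_bar:.5f}  K_bar={capital_at_rate(r_bar):.4f}")
```

For the assumed parameters, a higher $\sigma$ (a lower elasticity of substitution) raises the steady-state interest rate and lowers steady-state capital, in line with the argument in the text.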


The Effect of Declining Labor Income

What I want to capture here are the effects of “saving for retirement”
on aggregate capital accumulation. Introducing retirement, that is,
zero labor income after some given length of life, is not analytically
convenient. I assume instead that labor income declines with age at
rate $\alpha$. More precisely, I assume

$y(s, v) = aY(v)e^{\alpha(s - v)},$ (18)

where $a$ is a constant to be determined. The share of labor income
received by an agent is an exponentially decreasing function of age.$^{8}$
The value of $a$ is determined by the condition that

$Y(v) = \int_{-\infty}^{v} y(s, v)\,pe^{p(s - v)}\,ds = \frac{ap}{p + \alpha}Y(v), \quad \text{so that} \quad a = \frac{p + \alpha}{p}.$

Note that the case where $\alpha$ is positive is well defined only if $p$ is strictly
positive. If $p = 0$, agents are infinitely long lived. Individual labor
income must be the same, up to a constant, as aggregate labor income;
it cannot be a decreasing function of age.

The derivation of individual consumption, assuming logarithmic
utility, is identical to that of Section I. The derivation of aggregate
consumption is also the same, except for aggregate human wealth.
Given (18), individual human wealth is given by

$h(s, t) = ae^{\alpha(s - t)}\left\{\int_t^\infty Y(v)e^{-\int_t^v (\alpha + r_\mu + p)\,d\mu}\,dv\right\}.$

Note that the term in braces is the same for all agents. Thus aggregate
human wealth is

$H(t) = \int_{-\infty}^{t} h(s, t)\,pe^{p(s - t)}\,ds = \int_t^\infty Y(v)e^{-\int_t^v (\alpha + r_\mu + p)\,d\mu}\,dv.$

The effect of declining labor income is to increase the discount rate
on future aggregate labor income. This is because agents currently
alive will receive, even if still alive in the future, a smaller and smaller
share of total income.

Collecting equations gives the following: 


$C = (p + \theta)(H + W)$ (19)

$\dot{H} = (r + p + \alpha)H - Y$ (20)

$\dot{W} = rW + Y - C.$ (21)


8 This formalization can be extended to accommodate more complex paths of
income. If one wants to capture the fact that labor income initially increases and then
decreases with age, this can be done by assuming that the share is the sum of two
negative exponentials, i.e., is equal to cijc" 11 ’ ° + era’"' 1 ' "'.txj.a^, a, < ().a 2 > 0,a t ai +

U'M-t > 11 .

Fig. 4 (marked on the capital axis: $K^*$ with $r(K^*) = \theta$, and $\tilde{K}$ with $r(\tilde{K}) = \theta - \alpha$)


The equation for human wealth is given in differential form. The
only effect of $\alpha$ is to increase further the discount rate on future labor
income above the interest rate.

The alternative characterization of the dynamics, obtained by
differentiating (19) and eliminating $H$ and $W$ using (20) and (21), is

$\dot{C} = (r + \alpha - \theta)C - (p + \alpha)(p + \theta)W,$

$\dot{W} = rW + Y - C.$

I now turn to the dynamics and steady state of the closed economy.
The dynamics are characterized by

$\dot{C} = [r(K) + \alpha - \theta]C - (p + \alpha)(p + \theta)K,$ (22)

$\dot{K} = F(K) - C.$ (23)

The phase diagram associated with (22) and (23) is drawn in figure 4.
The $\dot{C} = 0$ locus goes through the origin, is convex, and reaches $\tilde{K}$
asymptotically, where $r(\tilde{K}) = \theta - \alpha$. Note that $r(\tilde{K})$ may be positive or
negative: figure 4 is drawn so that $r(\tilde{K})$ is negative. The $\dot{K} = 0$ locus
traces the net production function. The equilibrium is saddlepoint
stable. The stable arm SS is upward sloping.

What are the characteristics of the steady state? The interest rate $\bar{r}$ is
smaller than $\theta + p$, and larger than $\theta - \alpha$.

The proposition that $\bar{r}$ is larger than $\theta - \alpha$ follows from $\bar{K} < \tilde{K}$.
That $\bar{r}$ is less than $\theta + p$ is again proven by contradiction. Suppose
that $\bar{r} \geq \theta + p$. Then $(\bar{r} - \theta + \alpha)\bar{C} \geq (p + \alpha)\bar{C}$.
Equation (22) and $\dot{C} = 0$ imply $(\bar{r} - \theta + \alpha)\bar{C} = (p + \alpha)(p + \theta)\bar{K}$,
so that $(p + \theta)\bar{K} \geq \bar{C}$.
Equation (23) and $\dot{K} = 0$ imply $\bar{C} = F(\bar{K})$, so that $(p + \theta)\bar{K} \geq F(\bar{K})$. As
by assumption $\bar{r} \geq \theta + p$, we get $\bar{r}\bar{K} \geq F(\bar{K})$, which is impossible.

It is quite possible for $\bar{r}$ to be negative. As the golden rule in this
case is $r = 0$ (there is no population growth), the capital stock may
exceed the golden rule and the economy may be, as in Diamond,
dynamically inefficient.

Furthermore, an increase in $\alpha$ (that is, a more sharply declining
labor income path) increases steady-state capital and decreases $\bar{r}$. To
see this, consider
$dC/d\alpha|_{\dot{C} = 0, K = \bar{K}} = \bar{K}(p + \theta)(\bar{r} - \theta - p)/(\bar{r} - \theta + \alpha)^2$.

As $\bar{r} < \theta + p$ and $\bar{r} > \theta - \alpha$, $dC/d\alpha|_{\dot{C} = 0, K = \bar{K}} < 0$. In the neighborhood
of steady state, an increase in $\alpha$ shifts the $\dot{C} = 0$ locus to the right,
increasing steady-state capital. The reason for this result is clear: if
labor income accrues relatively early in life, it will lead to more savings
early in life and thus higher aggregate wealth.
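A numerical sketch can illustrate both results: the steady-state interest rate falls as $\alpha$ rises, and it can become negative. The sketch assumes a Cobb-Douglas technology $F(K, 1) = K^a$ and illustrative parameter values; these, and the function names, are assumptions rather than anything specified in the text. With $C = F(K)$, equation (22) at rest becomes $(r + \alpha - \theta)F(K)/K = (p + \alpha)(p + \theta)$, solved by bisection for $r$.

```python
def steady_state_rate(alpha, theta=0.03, p=0.02, a=0.3, delta=0.05, tol=1e-13):
    """Steady-state r for (22)-(23), assuming F(K, 1) = K**a."""
    def excess(r):
        output_per_capital = (r + delta) / a - delta   # F(K)/K when r(K) = r
        return (r + alpha - theta) * output_per_capital - (p + alpha) * (p + theta)

    lo = max(theta - alpha, -delta) + 1e-12   # r exceeds theta - alpha (and -delta)
    hi = theta + p                            # r is below theta + p
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def capital_at_rate(r, a=0.3, delta=0.05):
    """Capital stock at which the net marginal product equals r."""
    return (a / (r + delta)) ** (1.0 / (1.0 - a))


if __name__ == "__main__":
    for alpha in (0.0, 0.02, 0.05, 0.10):
        r_bar = steady_state_rate(alpha)
        print(f"alpha={alpha:.2f}  r_bar={r_bar:+.5f}  K_bar={capital_at_rate(r_bar):.4f}")
```

With the assumed values, the steady-state rate declines monotonically in $\alpha$ and turns negative for the steepest income profile, which is the dynamically inefficient case mentioned above.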

To summarize, the effect of finite horizons per se is to decrease
capital accumulation. The effect of declining labor income, through
saving for retirement, is however to increase it. The net effect is ambiguous
and the steady state can be inefficient.


IV. Effects of Taxes on Aggregate Demand 

The Government Budget Constraint 

I now introduce a government that spends on goods that do not affect
the marginal utility of private consumption and finances spending
either by lump-sum taxes or by debt.$^{9}$ Its dynamic budget constraint is
$\dot{D} = rD + G - T$, where $D$ is debt, $G$ is spending, and $T$ is taxes. I shall
refer to $T - G$ as the surplus, or deficit as the case may be, and to $\dot{D}$ as
the change in debt. This is only a semantic convention. The government
is also required to satisfy the transversality condition:$^{10}$

$\lim_{s \to \infty} D_s e^{-\int_t^s r_v\,dv} = 0.$

This condition, together with the dynamic budget constraint, is
equivalent to the statement that the level of debt is equal to the present
discounted value of future surpluses:

$D_t + \int_t^\infty G_s e^{-\int_t^s r_v\,dv}\,ds = \int_t^\infty T_s e^{-\int_t^s r_v\,dv}\,ds.$ (24)


9 The assumption that the government can use lump-sum taxation is made to focus
on the effects of finite horizons. As is well understood, if lump-sum taxation is not
available, there is a role for debt policy even if agents have infinite horizons.

10 In what follows, I assume that $r$ is nonnegative, at least asymptotically. I therefore
do not look at fiscal policy in the case of dynamic inefficiency.

The presence of taxes modifies slightly the aggregate consumption
function of Section I, which becomes

$C_t = (p + \theta)(H_t + W_t); \quad W_t = D_t + K_t$ (25)

$H_t = \int_t^\infty Y_s e^{-\int_t^s (r_v + p)\,dv}\,ds - \int_t^\infty T_s e^{-\int_t^s (r_v + p)\,dv}\,ds$ (26)

$\dot{W}_t = r_t W_t + Y_t - C_t - T_t.$ (27)


Financial wealth, $W$, now includes government debt, $D$, and other
assets, $K$. Human wealth is the present discounted value of noninterest
income minus taxes, discounted at rate $(r + p)$.

Effects of a Reallocation of Taxes 

Consider a decrease in taxes at time $t$ associated with an increase at
time $t + \tau$. Given the government budget constraint (24), the level of
debt $D_t$, and an unchanged path of $G$, these changes must satisfy

$dT_{t+\tau} = -e^{\int_t^{t+\tau} r_v\,dv}\,dT_t.$

The effect on the consumer at time $t$ is given by the effect on
human wealth. Equation (26) implies an effect of

$-dT_t - e^{-\int_t^{t+\tau} (r_v + p)\,dv}\,dT_{t+\tau},$

or, using the government budget constraint, an effect of $-dT_t(1 - e^{-p\tau})$.
Thus, unless $p = 0$, a decrease in taxes today increases human
wealth and consumption. The longer taxes are deferred, the larger
the effect. This effect of a reallocation of taxes comes from the different
discount rates in the government budget constraint and in the
definition of human wealth. This in turn reflects the fact that taxes
are partly shifted to future generations: $(1 - e^{-p\tau})$ is simply the
probability that someone currently alive will not have to pay the future
increase in taxes.
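The arithmetic of the tax reallocation can be sketched as follows, assuming for simplicity a constant interest rate (an assumption; the text allows a time-varying $r$). The function name and the numerical values are illustrative.

```python
import math


def human_wealth_effect(dT_t, r, p, tau):
    """Change in human wealth when taxes change by dT_t at t and are offset at t + tau.

    Assumes a constant interest rate r. The future tax change follows from the
    government budget constraint (24); both changes are then discounted at r + p
    as in equation (26).
    """
    dT_future = -math.exp(r * tau) * dT_t
    return -dT_t - math.exp(-(r + p) * tau) * dT_future


if __name__ == "__main__":
    for tau in (1.0, 10.0, 30.0):
        effect = human_wealth_effect(dT_t=-1.0, r=0.03, p=0.02, tau=tau)
        print(f"tau={tau:5.1f}  effect on human wealth={effect:.4f}")
```

The result equals $-dT_t(1 - e^{-p\tau})$: it is zero when $p = 0$ and grows with the deferral $\tau$.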

An Index of Fiscal Stance 

Fiscal policy (that is, the sequence of current and anticipated taxes,
spending, and debt) affects aggregate demand in three ways. Debt is
part of wealth and affects consumption; the sequence of taxes affects
human wealth and thus consumption. The level of government
spending affects aggregate demand directly. It is useful, both
conceptually and technically, to summarize these effects by an index of fiscal
policy. Let $g$ denote this index, so that, collecting all the terms in
aggregate demand that depend directly on fiscal policy,

$g_t = (p + \theta)\left[D_t - \int_t^\infty T_s e^{-\int_t^s (r_v + p)\,dv}\,ds\right] + G_t.$

This index can be rewritten as

$g_t = G_t - (p + \theta)\int_t^\infty G_s e^{-\int_t^s (r_v + p)\,dv}\,ds$

$\quad + (p + \theta)\left[D_t + \int_t^\infty (G_s - T_s)e^{-\int_t^s (r_v + p)\,dv}\,ds\right].$ (28)


The first line gives the effects of spending on aggregate demand if it
is financed exclusively by contemporaneous taxes. Effects of spending
are not the subject of this paper. Note, however, that if $r = \theta$, a
constant level of spending has no effect on aggregate demand. The
second line gives the effects of financing. Note first that if $p = 0$, the
government budget constraint (24) implies that this line is identically
equal to zero; financing is irrelevant. If $p$ is positive, this is not the
case, and if $D_t$ is positive, this term is likely to be positive. The second
line in effect measures the degree to which debt is net wealth and the
degree to which it is offset by anticipated future surpluses. It makes
clear that the degree to which debt is wealth depends on the whole
sequence of anticipated surpluses (deficits).

How does this index evolve over time for a given fiscal policy?
Consider a policy characterized by large deficits initially, followed
later, as debt accumulates, by surpluses so as to satisfy the
intertemporal government budget constraint. Does the initial increase in $g$
disappear as surpluses appear, or does $g$ remain positive as debt
accumulates? An example, inspired by the current U.S. fiscal policy, will
suggest the answer. The specific fiscal policy we consider is
characterized by

$\dot{D}_t = rD_t - T_t; \quad T_t = \beta D_t - Z; \quad D_{t_0} = 0.$ (29)

Government spending is equal to zero for simplicity. Taxes are an
increasing function of the level of debt. A necessary and sufficient
condition for the government transversality condition to be satisfied is
that $\beta > r$, so that an increase in debt reduces $\dot{D}$. The interest rate $r$ is
assumed constant.

Consider the effect of an increase at time $t_0$ of $Z$ from zero, which
implies initially a sequence of deficits. As debt accumulates, taxes
increase, surpluses appear eventually, and debt converges to a new
steady-state value, $D_\infty$, where $D_\infty = Z/(\beta - r)$. The smaller $(\beta - r)$, the
longer the sequence of deficits, and the higher the steady-state level of
debt.

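Solving (29) for the path of $D$ and $T$ and replacing in (28) gives

$g_t = \frac{(p + \theta)p}{r + p}\left\{(\beta - r)^{-1} - \left[(\beta - r)^{-1} - (\beta + p)^{-1}\right]e^{-(\beta - r)(t - t_0)}\right\}Z,$ (30)

so that in particular

$g_0 = \frac{(p + \theta)p}{(r + p)(\beta + p)}Z,$

$g_\infty = \frac{(p + \theta)p}{(r + p)(\beta - r)}Z = \frac{(p + \theta)p}{r + p}D_\infty > g_0.$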


The value of $g$ at time $t_0$, $g_0$, depends on the sequence of anticipated
deficits and thus on $\beta$. The steady-state value of $g$, $g_\infty$, is easy to
understand. In steady state, taxes required to pay interest on the debt
are equal to $rD_\infty$. Their present value in human wealth is $-[r/(r + p)]D_\infty$,
and thus the net effect of debt is $(p + \theta)\{D_\infty - [r/(r + p)]D_\infty\}$,
which is equal to $g_\infty$.

The index therefore increases from $g_0$ to $g_\infty$: although the budget
goes from deficit to surplus, the increase in debt dominates, leading to
an increase in $g$.
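The path of the index under this policy can be sketched numerically. The code below assumes zero initial debt, zero spending, and a constant interest rate, and sets $t_0 = 0$; the closed form it uses, $g_t = [(p + \theta)p/(\beta + p)][D_t + Z/(r + p)]$, is derived here from the tax rule and the definition of the index rather than quoted from the text, and it is cross-checked against a direct numerical integration of the present value of taxes. Function names and parameter values are illustrative assumptions.

```python
import math


def debt(t, Z, r, beta):
    """Debt path for dD/dt = (r - beta)*D + Z with D(0) = 0 and beta > r."""
    return Z / (beta - r) * (1.0 - math.exp(-(beta - r) * t))


def fiscal_index(t, Z, r, beta, p, theta):
    """Closed form of g_t = (p + theta)*[D_t - PV of taxes discounted at r + p]."""
    return (p + theta) * p / (beta + p) * (debt(t, Z, r, beta) + Z / (r + p))


def fiscal_index_numeric(t, Z, r, beta, p, theta, horizon=600.0, n=120_000):
    """Same index, with the present value of taxes computed by the midpoint rule."""
    h = horizon / n
    pv_taxes = 0.0
    for i in range(n):
        s = t + (i + 0.5) * h
        taxes = beta * debt(s, Z, r, beta) - Z
        pv_taxes += taxes * math.exp(-(r + p) * (s - t)) * h
    return (p + theta) * (debt(t, Z, r, beta) - pv_taxes)


if __name__ == "__main__":
    Z, r, beta, p, theta = 1.0, 0.03, 0.05, 0.02, 0.03   # illustrative values
    for t in (0.0, 10.0, 50.0, 500.0):
        print(f"t={t:6.1f}  D={debt(t, Z, r, beta):8.3f}  g={fiscal_index(t, Z, r, beta, p, theta):.4f}")
```

Under these assumptions the index rises monotonically from $g_0$ toward its steady-state value even as the budget moves from deficit to surplus.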

I now turn to the general equilibrium effects of fiscal policy. 


V. Steady-State Effects of Fiscal Policy 

My focus in this section is on the steady-state effects of debt on the
level of consumption and holdings of other assets. These effects are
best understood by considering the dynamic effects of the following
rather artificial policy experiment. Starting from steady state, the
government, at some time $t_0$, issues and distributes new additional debt
while increasing taxes to pay for the additional interest payments.
The level of debt remains constant at its new level forever. If the
interest rate changes over time, taxes are adjusted to always cover
interest payments. I consider first the open economy and then the
closed economy.


The Open Economy

In the presence of fiscal policy the equations of motion are for the
open economy:

$C = (p + \theta)(H + D + F)$

$\dot{F} = rF + \omega - C - G$

$\dot{D} = rD + G - T.$

The fiscal policy I consider is such that $D$, $G$, and $T$ are constant
(and thus $\dot{D}$ is equal to zero), except at time $t_0$ when $D$ and $T$ increase
permanently. Their increases satisfy $dT_0 = r\,dD_0$.
One can solve for steady-state $C$ and $F$ as functions of $D$ and $G$ ($T$ is
determined implicitly by the government budget constraint):

$F = (p + \theta - r)^{-1}(r + p)^{-1}[(r - \theta)(\omega - G) - (p + \theta)pD];$

$C = \omega - G + rF.$

The steady-state level of foreign assets is a decreasing function of
the level of government debt and so is steady-state consumption.
More precisely, $dF/dD = -(p + \theta - r)^{-1}(r + p)^{-1}(p + \theta)p \lessgtr -1$ as
$r \gtrless \theta$.

Government debt displaces foreign assets in agents’ wealth. The
displacement is one for one if $r = \theta$ (note that in this case $F = -D$),
but it may be much larger if $r$ is larger than $\theta$. These results differ
sharply from the infinite horizon case where the level of debt has no
effect on the steady-state level of $F$. By choosing its level of debt, the
government can choose any level of steady-state consumption it
desires.

These effects are easier to understand if one examines the process
of adjustment and the savings function. Consumption is given by
$(p + \theta)[(\omega - T)/(r + p) + D + F]$ and income by $\omega - T + r(D + F)$, so
that savings are given by

$S = \frac{r - \theta}{r + p}(\omega - T) + (r - p - \theta)(D + F).$

Thus the change in fiscal policy at time $t_0$ implies a decrease in savings
of $dS_0/dD_0 = -p(p + \theta)/(r + p)$.

The increase in taxes and debt does not affect income but leads
agents to feel wealthier by an amount $[p/(r + p)]\,dD_0$. This leads them
to increase consumption and dissave and to decumulate foreign
assets. This decumulation proceeds until a lower level of foreign assets
has been reached and savings are again equal to zero. By then, both
foreign assets and consumption are lower.
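The displacement of foreign assets by debt, and the initial fall in savings, can be checked with a few lines of arithmetic. This is a minimal sketch with illustrative function names and parameter values (all assumptions); it evaluates the steady-state expression for $F$, the derivative $dF/dD$, and the savings function with taxes.

```python
def foreign_assets(r, theta, p, omega, G, D):
    """Steady-state F as a function of debt D and spending G."""
    return ((r - theta) * (omega - G) - (p + theta) * p * D) / ((p + theta - r) * (r + p))


def displacement(r, theta, p):
    """dF/dD: change in steady-state foreign assets per unit of debt."""
    return -(p + theta) * p / ((p + theta - r) * (r + p))


def savings(r, theta, p, omega, T, D, F):
    """Income less consumption when taxes T and debt D are present."""
    consumption = (p + theta) * ((omega - T) / (r + p) + D + F)
    income = omega - T + r * (D + F)
    return income - consumption


if __name__ == "__main__":
    theta, p = 0.03, 0.02                     # illustrative values
    for r in (0.02, 0.03, 0.04):
        print(f"r={r:.2f}  dF/dD={displacement(r, theta, p):.3f}")
```

With the assumed values the displacement is less than, equal to, and more than one for one as $r$ is below, equal to, and above $\theta$.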


The Closed Economy

The equations of motion are in this case

$\dot{C} = [r(K) - \theta]C - p(p + \theta)(D + K);$

$\dot{K} = F(K) - C - G;$

$\dot{D} = r(K)D + G - T.$

I consider the same fiscal policy as before, that is, an increase of $D$
and $T$ at time $t_0$ that satisfies $dT_0 = r(K_0)\,dD_0$. As the interest rate will
now vary after time $t_0$, taxes implicitly vary after time $t_0$ to cover
interest payments on the new constant level of debt.

Fig. 5 (locus shown: $\dot{C} = 0$)

I characterize the steady state graphically in figure 5. Let $K^*$ be
such that $r(K^*) = \theta$. As long as $D$ is larger than $(-K^*)$, the $\dot{C} = 0$ locus
goes through the origin and reaches $K^*$ asymptotically from the left.
If $D$ is negative and less than $(-K^*)$, the $\dot{C} = 0$ locus reaches $K^*$
asymptotically from the right. The $\dot{K} = 0$ locus is drawn for a value of
$G$ equal to zero.

If $G$ equals zero and debt is larger than $(-K^*)$, there is a unique
saddle-point stable equilibrium. If $G$ is positive, there might be zero or
two equilibria. The same is true if debt is less than $(-K^*)$. The cases
of zero and two equilibria will not be considered further.

The effect of an increase in government debt is to shift the $\dot{C} = 0$
locus to the left and thus to decrease the steady-state levels of capital
and consumption. The government can again choose any level of
capital by an appropriate choice of debt. If, for example, the government
uses $\theta + p$ as the social discount rate, it may achieve the desired
level of capital $K^{**}$, such that $r(K^{**}) = \theta + p$, by issuing a positive
amount of debt. If instead it uses $\theta$ as the social discount rate, it must
issue a negative amount of debt, in this case precisely an amount equal
to $-K^*$.

The dynamics of adjustment to the increase in debt are qualitatively
similar to those of the open economy. The increase in debt and taxes
creates an initial wealth effect on consumption, leading to capital
decumulation. In the new steady state, capital and consumption are
lower.

I now turn to the dynamic effects of more realistic fiscal policies. 


VI. Dynamic Effects of Specific Fiscal Policies 

Internal and External Deficits

The first policy I consider is one that resembles the current U.S.
policy, in which debt is initially low and deficits are large and in which,
as debt accumulates, the government slowly moves from deficits to
surpluses until debt has stabilized at a new higher level. I limit my
analysis to the case of an open economy: the focus is therefore not on
the interest rate, which is given, but on the relation between government
and current account deficits.

The fiscal policy is the same as that described in Section IV and is
implemented unanticipatedly at time $t_0$. It is convenient in this case to
use the index of fiscal policy defined in Section IV and derived for
this particular policy, also in Section IV. The equations of motion can
be written as

$C = (p + \theta)\left[\frac{\omega}{r + p} + F\right] + g;$

$\dot{F} = rF + \omega - C;$

$\dot{g} = (r - \beta)g + \frac{(p + \theta)p}{r + p}Z; \quad g_{t_0} = \frac{(p + \theta)p}{(r + p)(\beta + p)}Z.$

As government spending is by assumption equal to zero, fiscal policy
has an effect only through consumption. The behavior of $g$ was
characterized in equation (30) and is written here equivalently in
differential equation form. At time $t_0$, $g$ increases from zero to $g_0$.

The dynamics of $F$ and $g$ are characterized in figure 6. The stability
condition is that $r$ be less than $\theta + p$. If it is satisfied, the locus $\dot{F} = 0$ is
downward sloping, with slope $(r - p - \theta)^{-1}$. The $\dot{g} = 0$ locus is
vertical.

The steady state, prior to the change in fiscal policy, is point $E$. At $t_0$,
the locus $\dot{g} = 0$ shifts to the right, and $g$ jumps from 0 to $g_0$. The
system jumps from $E$ to $A$. Over time, the economy converges from $A$
to $E'$.
The effect of fiscal policy is thus a decumulation of foreign assets
as government debt accumulates. The rates of foreign asset accumulation
and government debt accumulation are not, however, related
in any simple way: the rate of foreign asset decumulation depends not
only on the current deficit but also on the entire sequence of deficits
(surpluses). For example, if $r = \theta$, we have seen in the previous section
that in the long run the decrease in foreign assets is equal to the
increase in government debt. At $t = t_0$, however, the effect of the rate
of change of government debt on the rate of change in foreign assets
is given by $d\dot{F}/d\dot{D}|_{t_0} = -g_0/Z = -p/(\beta + p)$. Thus the short-run effect
is less than one for one. It tends to one as $p$ tends to infinity, as the
horizon of agents shortens. As $\beta$ reaches its lower bound $r$ (i.e., the
lowest value consistent with satisfaction of the government
transversality condition), it tends to $p/(r + p)$.


Output Cycles and the Role of Debt Policy

This second example examines the effects of regular fluctuations in
output on the decentralized economy and the role of debt in such a
case.

The economy is open, and for simplicity I assume $r = \theta$. If I define
total financial wealth $W$ as the sum of foreign assets and government
debt, the equations of motion can be written as

$\dot{C} = -p(p + \theta)W$

$\dot{W} = rW + \omega - C - T$

$(\dot{D} = rD + G - T).$

Now suppose that $\omega_t$ follows: $\omega_t = \Psi + \sin t$; $\Psi > 0$. In the absence
of fiscal policy ($T = D = G = 0$), what will the behavior of $C$ and $W$
be?

Some algebra yields

$C_t = a_1 \sin t + a_2 \cos t + \Psi$

$W_t = b_1 \sin t + b_2 \cos t,$

where

$a_1 = \Delta p(p + \theta)[1 + p(p + \theta)] \approx p(p + \theta)$

$a_2 = -\Delta p(p + \theta)\theta \approx -p(p + \theta)\theta$

$b_1 = -\Delta\theta \approx 0$

$b_2 = -\Delta[1 + p(p + \theta)] \approx -1$

and

$\Delta \equiv \{\theta^2 + [1 + p(p + \theta)]^2\}^{-1}.$

The approximations to the coefficients hold if $p$ and $\theta$ are not far from
zero.

If $p = 0$, agents completely smooth out consumption. Aggregate
consumption is constant. Agents accumulate foreign assets when $\omega$ is
high and decumulate when $\omega$ is low.

If $p > 0$, aggregate consumption is cyclical. As $r = \theta$, each agent still
has a flat consumption path. The newly born do not, however, have
the same level of consumption as those who die. If $\theta = 0$, aggregate
consumption moves in phase with income, but by less.
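The sinusoidal solution can be verified by substituting it back into the equations of motion. The sketch below assumes $r = \theta$ and $T = D = G = 0$, and uses the coefficient signs obtained by matching the sine and cosine terms in $\dot{C} = -p(p + \theta)W$ and $\dot{W} = \theta W + \omega - C$; the function names and numerical values are illustrative assumptions.

```python
import math


def cycle_coefficients(p, theta):
    """Coefficients (a1, a2, b1, b2) of the sin/cos solution for C and W."""
    k = p * (p + theta)
    delta = 1.0 / (theta ** 2 + (1.0 + k) ** 2)
    a1 = delta * k * (1.0 + k)
    a2 = -delta * k * theta
    b1 = -delta * theta
    b2 = -delta * (1.0 + k)
    return a1, a2, b1, b2


def residuals(p, theta, t, psi=1.0):
    """Residuals of the two equations of motion at date t (both should vanish)."""
    a1, a2, b1, b2 = cycle_coefficients(p, theta)
    k = p * (p + theta)
    s, c = math.sin(t), math.cos(t)
    C = psi + a1 * s + a2 * c
    W = b1 * s + b2 * c
    C_dot = a1 * c - a2 * s
    W_dot = b1 * c - b2 * s
    omega = psi + s
    return C_dot + k * W, W_dot - (theta * W + omega - C)


if __name__ == "__main__":
    for p, theta in ((0.0, 0.03), (0.02, 0.03), (0.02, 0.0)):
        a1, a2, b1, b2 = cycle_coefficients(p, theta)
        print(f"p={p:.2f} theta={theta:.2f}  a1={a1:.5f} a2={a2:.6f} b1={b1:.5f} b2={b2:.5f}")
```

Both residuals vanish up to rounding error; with $p = 0$ the consumption coefficients are zero (full smoothing), and with $\theta = 0$ the cosine term in consumption drops out, so consumption moves in phase with income with amplitude $a_1 < 1$.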

These movements in aggregate consumption suggest a role for
fiscal policy. As $r = \theta$, the consumption of each individual is constant
throughout life. Different cohorts, however, have different levels of
consumption. Thus, if the social welfare function is concave in
individual utilities, it is desirable to smooth consumption across cohorts.
Fiscal policy can achieve constant aggregate consumption over time.
The equations of motion above show how this can be done: $T$ must
simply equal $\sin t$. If all deviations of $\omega$ from its mean $\Psi$ are taxed,
consumption and wealth are constant. Government debt and foreign
assets follow symmetric but opposite paths. As government debt
increases, for example, it displaces foreign assets in the agents’
portfolios one for one.

This change in portfolio composition has no effect, in an open
economy, on either $\omega$ or $r$. This would not be the case if the same debt
policy were pursued in a closed economy: variations in the capital
stock would affect both $\omega$ and $r$. Characterization of optimal policy
would be substantially more difficult.

The purpose of this paper was to characterize rigorously the effects of
intertemporal reallocations of taxes when agents have finite horizons.
To that end, many assumptions, such as the existence of lump-sum
taxes or a constant employment level, were made that need to be
removed to obtain a more realistic characterization of the effects of
debt and deficits. The aggregate consumption function developed
here seems well adapted to the task. The index of fiscal policy should
also prove useful both conceptually and empirically.

The Tiebout Model: Bring Back the Entrepreneurs


J. Vernon Henderson

Brown University


Several recent papers in the literature have reformulated the nature
of equilibrium in Tiebout models by assuming an exogenous number
of communities, inflexible community boundaries, and in particular
inactive landowners and developers. This paper argues that
these assumptions are unwarranted and result in indeterminate
solutions and flawed analyses. Determinate long-run solutions
require equilibrium in intercommunity land markets, which in turn
require giving landowners and/or entrepreneurial developers an
active role in the models. The role of politics in these models and its
juxtaposition with entrepreneurial activities are also analyzed.


Recent papers such as those by Epple and Zelenitz (1981), Bucovetsky
(1982), and Yinger (1982) on the Tiebout model differ fundamentally
from earlier work by Ellickson (1970, 1971) and Hamilton (1975,
1976) in their formulation of long-run equilibrium in a Tiebout
model. These recent papers move beyond the central focus of earlier
work on population stratification to ask questions about the role of
local politics, the role of entrenched interests, and the impact of
external equalizing grants in the model. At the same time these papers
have altered the formulation of what constitutes long-run equilibrium,
particularly in intercommunity land markets. We will argue
that the new formulation is incorrect and that a proper formulation
of the problem will alter the answers to some of the questions asked.

In the earlier work, the process of population stratification generally
occurs in a setting where communities form on, say, a flat
featureless plain, and each community perceives that land is available to
it in infinitely elastic supply at a fixed price. First, this perception can
mean either that total urban demand for land accounts for only a
fraction of overall land use in competition with agricultural or
recreational uses, or, if all local land is in urban use, that any one
community accounts for only a tiny fraction of total urban demand. Second,
it implies that community boundaries are flexible in the long run and
respond so that land of uniform quality has the same price across
communities. Third, there is a general presumption that the number
of communities that form is endogenous. While we will suggest that
maintaining endogeneity of community numbers and boundaries is
the best way to formulate the problem, we will argue below that
flexibility of boundaries and numbers is not necessarily critical, providing
landowners and developers have an active role in the model.

In contrast to these notions of flexible community boundaries and
numbers of communities, the recent papers presume a fixed number
of communities with fixed land areas. Thus, there is no land use
adjustment through variation in community boundaries and numbers
to equalize land prices across communities. Moreover, there is no
other adjustment mechanism for land uses across the fixed number of
communities to equalize marginal products of land. This paper will
argue that the solutions in the recent papers are certainly not
long-run solutions, but might deal with what could be termed temporary
equilibria in an unspecified dynamic context.

I will show why it is desirable to reincorporate equilibrium in
intercommunity land markets in long-run versions of the Tiebout model.
This will generally involve giving back landowners and developers an
active entrepreneurial role in the model. Hence the title of this paper.
This demonstration will be done in the context of examining a question
raised by Epple and Zelenitz (1981): “Does Tiebout need
politics?” I will show that Tiebout does not need politics in a properly
formulated long-run model. As an aside, it should be noted that
Epple and Zelenitz do not focus on the specific question they raise.
They actually ask, if Tiebout has politics, whether population
movements will eliminate the negative impact of “bad” politics. While
Epple and Zelenitz answer this second question correctly with a no in
their context, I will argue that in a long-run solution, population
movements and land use adjustment will also eliminate the negative
impact of bad politics.

The formal model in this paper as in the papers referenced above is
a “single-period” or long-run equilibrium model where capital (housing
structures) and lot sizes are perfectly malleable, so that comparative
statics allows the world to be dissolved and costlessly restructured
between solutions. In the concluding section I will argue that formally
introducing dynamic considerations can alter the issues to be considered,
in ways not generally recognized in the literature. The introduction
of nonmalleability of lot sizes and structures and transactions
costs of moving for existing residents means that the current decisions
and behavior of residents will be affected by what they expect future
public-good tax-development policies in the community will be, which
introduces the issue of “time inconsistency” (e.g., Kydland and
Prescott 1977; Fischer 1980).

Before turning to the conceptual issues analyzed in Sections I and 
II, it would be useful to ask whether it is more reasonable to assume a
world where community numbers and boundaries are fixed or one
where they are variable. The basic facts for the United States may
surprise people because they indicate a degree of fluidity far beyond
what seems to be generally assumed. In table 1 I look at how the
number of urban places (local political units) over 2,500 has varied
over time. Under A, I compare decade rates of growth in population
of all urban places over 2,500 and growth rates in numbers of urban
places over 2,500. Because data prior to 1950 are limited, I focus on
urban places over 2,500. However, for 1950-60 and 1960-70, the
growth rates for all urban places (including those under 2,500) are
the same as the ones in the A columns, so that, overall, the growth
rates in table 1 reflect new incorporations (not simply movements of
tiny urban places into larger urban place categories). While the
growth rates of numbers of places are generally less than for
population, since 1930 the two growth rates are very close. Moreover, this
near equality in growth rates holds by class of city. This is illustrated
in the B columns for the decade with the highest overall growth rate,
where I compare growth rates in population and numbers of places
by city size category. 

Perhaps even more surprising than the high growth rates of num¬ 
bers of communities may be the extent of changes in political bound¬ 
aries through annexation (and even detachments). These numbers 
are given in table 2. Two basic facts emerge. On average, the growth 
rates for land areas far exceed the growth rates for population in 
existing urban places. Second, even in relatively short periods of time 
(e.g., 1970-76), most cities experience growth through annexations.
Even for any one year in the period 1970-76, about 30 percent of all
cities had significant annexations. The only exception appears to be 
cities in the northeastern United States. While the focus is on annexa¬ 
tions, detachments also occur, although at a much lower rate (detach¬ 
ments are about 1.4 percent of annexations). 

Although I obviously have not shown that community numbers and 
sizes adjust to equalize prices for land of uniform quality, it is clear
that, for whatever reasons, the numbers and land areas of local
political units are highly flexible over time. An assumption of flexibility
seems not simply warranted but central to any analysis of the U.S.
situation.


I. Long-Run Equilibrium: Tiebout Does Not 
Need Politics 

I start by examining three possible models of intracommunity
equilibrium and then go on to determine which of these formulations is
consistent with long-run equilibrium across communities. A major 
point will be that whatever the politics involved, entrepreneurial de¬ 
velopers are required to play an active role in the model to obtain 
solutions consistent with the usual notions of what constitutes long- 
run equilibrium. 

In presenting the three models of intracommunity equilibrium, I
make a number of standard simplifying assumptions, corresponding
to those in, for example, Epple and Zelenitz (1981). These are not
critical to my analyses and in many cases simply represent a need to
limit the number of alternative solutions we consider. First, nonland
income, nonhousing production, and the cost of capital are exogenous
to the problem. Second, within a community people have identical
incomes and tastes, reflecting the Tiebout forces encouraging
stratification. Third, local public services are generally financed solely 
by a property tax on housing. It is assumed that this tax distorts 
housing consumption or that the type of "perfect zoning” in Hamil¬ 
ton (1975) is not feasible. This paper is not taking a stance on this 
issue but simply choosing the more common assumption. Finally, I 
assume in all cases that the number of communities is very large, so 
that communities do not see their actions affecting prices in other 
communities. 

Before proceeding, it is useful to clarify the possible activities of 
agents in the economy, although their activities can vary according to 
the particular model of intracommunity equilibrium. First, there are 
land developers who do not live in the community and own the land, 
at least as an initial endowment. Developers may or may not be re¬ 
sponsible for providing public services. Second, there are residents 
who either rent or buy housing in the community. Since in Section I 
of the paper we are using the traditional single-period model to 
characterize long-run equilibrium, renting versus owner occupancy 
are indistinguishable modes. In Section II, in a dynamic context, they 
will become critically different. Third, if there is politics with
democratic (costless) voting, the only agents who vote are residents,
whether they rent or own. With politics there is a costless government 
that provides public services. Finally, there can be faceless contractors 
who actually construct the housing with borrowed or purchased capi¬ 
tal and land and then rent or sell it to residents. Alternatively, resi¬ 
dents may directly build their own housing out of land and capital, 
borrowed or purchased from absentee land developers and capital 
owners. Or, as another alternative, land developers may construct 
and sell or rent out the housing. These three cases are equivalent in a 
long-run model. 


A. Intracommunity Equilibria

In presenting the three models of intracommunity equilibrium we 
start by assuming that community land area is fixed. Then in Section
IB we examine the impact of allowing community land area to vary on
intercommunity equilibria and corresponding intracommunity
equilibria. Under our various assumptions, intracommunity equilibrium
deals only with equilibrium in local housing markets and in
provision of local public services. For the housing market, total
housing produced is F(K, L), where L is the fixed land area and K is capital
inputs. Total housing demand is per resident housing demand h(·)
multiplied by the endogenous population N. The arguments of h(·)
are prices, public service levels, and either income or utility as noted
later. In housing market equilibrium in the community

\[ F(K, L) = N \cdot h(\cdot). \tag{1} \]

The specification of the fiscal side depends on the model in question.
We now turn to the three models.

Local Public Goods: Tiebout without Politics 

With no politics, the community is modeled as a club, where there is a 
club owner, in this case a group of land developers who act jointly
through, say, a single land management company. The developers
choose per person local public service levels, g, and property tax rates,
t, to maximize profits. Profits are land rents plus taxes less public
services expenditures, or

\[ \pi = p_L L + t p F(\cdot) - c N g, \tag{2} \]

where p_L is the price the developers charge for land, p is the price of
housing, so tpF are total property taxes, and c is the constant cost of a
unit of public services, which are modeled in the literature as a
Samuelson private good. To be more general, in equation (2) the land
company does not actually provide the housing, so that contractors or
individuals can put up the structures. The results are identical if the
company does provide the housing, so that profits are redefined as
p(1 + t)F(·) − cNg.

The land management company is constrained by equilibrium in
the housing market as in equation (1). It is constrained by the
builders’ (or its own) choice of capital in housing, where capital is chosen
according to the usual marginal productivity condition, or

\[ p_K = p F_K, \tag{3} \]

where p_K is the fixed cost of capital and F_K = ∂F(·)/∂K. Land rents are a
residual, or

\[ p_L L = p F(\cdot) - p_K K. \tag{4} \]

Finally, the company is constrained by having to pay utility levels, V̄,
to its residents equal to those prevailing in the local economy. The
indirect utility in this community is V(y, p̃, g), where y is exogenous
income and p̃ is the gross-of-tax price of housing; or

\[ \tilde{p} = p(1 + t). \tag{5} \]

Thus the company faces the constraint that

\[ \bar{V} - V(y, \tilde{p}, g) = 0. \tag{6} \]

To solve for the characteristics of intracommunity equilibrium, we
can proceed in one of two ways. We can do a constrained maximization
problem, where equation (2) is maximized subject to equations
(1), (4), and (6).¹ However, since the model is simple enough, let us
differentiate equations (1), (3), (4), (5), and (6) to solve for how the
variables (p_L, p, F, and N) in equation (2) change, given market forces
in the community as t and g change. Differentiating yields the key
equations


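\[ \hat{p} = -\widehat{(1 + t)} + \frac{mg}{\tilde{p}h}\,\hat{g}, \tag{7} \]

\[ \hat{N} = -\frac{\theta_K}{\varepsilon_K}\,\widehat{(1 + t)} + \left[\left(\frac{\theta_K}{\varepsilon_K} + \eta\right)\frac{mg}{\tilde{p}h} - \gamma\right]\hat{g}, \tag{8} \]

where a hat (^) indicates rate of change. Further,

\[ \hat{p}_L = \frac{\hat{p}}{\theta_L} \quad \text{and} \quad \hat{F} = \frac{\theta_K}{\varepsilon_K}\,\hat{p}, \tag{9} \]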


where m is the marginal evaluation of public goods, (∂V/∂g)/(∂V/∂y),
and θ_K and θ_L are capital’s and land’s shares in housing revenue. The
measures of supply inflexibility, ε_K; price elasticity of housing
demand, η; and “complementarity” (γ > 0) or “substitutability” (γ < 0)
between h and g are defined as

\[ \varepsilon_K = -\frac{F_{KK} K}{F_K} > 0, \qquad \eta = -\frac{\partial h}{\partial \tilde{p}}\,\frac{\tilde{p}}{h}, \qquad \gamma = \frac{\partial h}{\partial g}\,\frac{g}{h}. \tag{10} \]

For a linear homogeneous F(·) function, ε_K = θ_L/σ, where σ is
the elasticity of substitution in production and hence, in (9),
F̂ = (θ_K/θ_L)σp̂.
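The relation between supply inflexibility, land's share, and the elasticity of substitution can be checked numerically. The sketch below is illustrative only. It assumes a CES form for F (the text does not specify a functional form), takes supply inflexibility to be ε_K = −K·F_KK/F_K, and approximates derivatives by finite differences. The function names and parameter values are hypothetical.

```python
# Illustrative check of eps_K = theta_L / sigma for a linear homogeneous F.
# Assumptions (not from the paper): the CES functional form, the parameter
# values, and the reading eps_K = -K * F_KK / F_K.

A = 0.4      # hypothetical CES distribution parameter
RHO = -1.0   # hypothetical CES exponent; sigma = 1 / (1 - RHO) = 0.5


def housing(K, L):
    """CES housing production F(K, L), homogeneous of degree one."""
    return (A * K**RHO + (1.0 - A) * L**RHO) ** (1.0 / RHO)


def supply_inflexibility(K, L, rel_step=1e-4):
    """eps_K = -K * F_KK / F_K, using central finite differences in K."""
    h = rel_step * K
    f_k = (housing(K + h, L) - housing(K - h, L)) / (2.0 * h)
    f_kk = (housing(K + h, L) - 2.0 * housing(K, L) + housing(K - h, L)) / h**2
    return -K * f_kk / f_k


def land_share(K, L, rel_step=1e-4):
    """theta_L = L * F_L / F, using a central finite difference in L."""
    h = rel_step * L
    f_l = (housing(K, L + h) - housing(K, L - h)) / (2.0 * h)
    return L * f_l / housing(K, L)


sigma = 1.0 / (1.0 - RHO)
eps_k = supply_inflexibility(2.0, 3.0)
theta_l = land_share(2.0, 3.0)
print(eps_k, theta_l / sigma)  # the two numbers should agree closely
```

With a lower elasticity of substitution (a more negative RHO), the same land share implies a larger ε_K, which is the sense in which ε_K measures inflexibility of housing supply.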

Then, maximizing profits in equation (2) with respect to g and t
(given the impact of g and t on all variables in the community) we get,
after substitutions,

\[ cgN = tpF, \tag{11} \]

\[ m = \frac{c(1 - \gamma)}{1 - \eta[t/(1 + t)]}. \tag{12} \]

¹ It is then also useful to specify L = lN, where l is the per person
lot size set by the land company.

Given the housing consumption distortion caused by the property
tax, equations (11) and (12) represent an “optimal” (i.e., second-best)
solution. First, all property taxes are spent on public services, so that
the land company does not exploit residents fiscally. Second, a
second-best version of the condition for public goods consumption is
satisfied. For γ = 0, the marginal evaluation of public goods, m,
exceeds marginal costs, c (given 1 − η[t/(1 + t)] < 1), so that public
goods are “underconsumed” just like housing. This degree of
underconsumption increases as η or t/(1 + t) increases, since increases in
both these are associated with greater underconsumption of housing
associated with the property tax distortion. This notion of
underconsumption of g is adjusted according to whether h and g are
complements (γ > 0) or substitutes (γ < 0).
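The comparative statics just described can be illustrated with a few numbers. The sketch below assumes that equation (12) reads m = c(1 − γ)/{1 − η[t/(1 + t)]}, which is a reconstruction from the surrounding text. The function name and the parameter values are hypothetical.

```python
def second_best_wedge(eta, t, gamma=0.0):
    """Return m / c implied by m = c * (1 - gamma) / (1 - eta * t / (1 + t)).

    eta   : price elasticity of housing demand (positive)
    t     : property tax rate
    gamma : elasticity of housing demand with respect to g
    """
    denominator = 1.0 - eta * t / (1.0 + t)
    if denominator <= 0.0:
        raise ValueError("requires eta * t / (1 + t) < 1")
    return (1.0 - gamma) / denominator


# Hypothetical parameter values, chosen only for illustration.
cases = [(0.5, 0.25, 0.0), (1.0, 0.25, 0.0), (0.5, 1.0, 0.0), (0.5, 0.25, 0.1)]
for eta, t, gamma in cases:
    print(eta, t, gamma, round(second_best_wedge(eta, t, gamma), 4))
```

A ratio above one corresponds to underconsumption of g. Under this reading the ratio rises with η and with t/(1 + t), and it falls when h and g are complements (γ > 0).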

In summary, in terms of intracommunity equilibrium, Tiebout 
without politics produces efficient solutions. That is, we have max¬ 
imized developer profits holding the utility of any residents constant. 


Local Public Goods: Tiebout with “Good” Politics

Suppose public goods are provided by a government, which acts only 
to satisfy the demands of its homogeneous resident voters. The fixed 
amount of land in the community is supplied by passive landowner/ 
developers, who accept the highest bid offered for their land but who 
for the moment play no other role. In terms of intracommunity
equilibrium, with a balanced budget and identical voters, since g = tph/c,
the indirect utility function may be written as

\[ V = V[y, p(1 + t), tph/c]. \tag{13} \]

Naive voters choose t to maximize (13) assuming p and h are
unchanged. More sophisticated voters recognize that because t directly
affects p̃ = p(1 + t), their h and necessarily, in their perception,
everyone else’s h will vary directly with t and indirectly with g (for
complementarity or substitutability). Naive voters in maximizing (13)
choose t such that m = c. More sophisticated voters will satisfy the
optimality condition in equation (12). While there is some debate in
the literature about whether to assume naivete or not (e.g., Epple,
Filimon, and Romer 1983), one could argue that any government that
proposed a t and g that satisfied equation (12) would be elected, since
realized utility, V[y, p(1 + t), g], would be maximized for any p. That
is, voters do not need to be sophisticated, providing they are offered 
(and believe) the sophisticated alternative. 

In terms of the housing market, with migration, equilibrium must
be such that the p (and the corresponding values of other endogenous
variables, including N) that satisfies equation (1) is the p for which
V̄ − V[y, p(1 + t), g] = 0, where t, g are the values that satisfy a balanced
budget, and either equation (12) holds or m = c. V̄ is the equilibrium
alternative utility level for these residents in the rest of the economy. 

Before proceeding to the third model, we note that this good- 
politics formulation leaves the issue of potential migration and resi¬ 
dents’ interaction with the rest of the world vague. In the formulation 
residents obviously do not see themselves as explicitly constrained by 
migration in the utility they can achieve—they do go through the 
motions of maximizing utility in choosing g and t. Yet, as we will see, 
we do not particularly want them to perceive community population 
as being fixed. Basically, they simply ignore the potential impacts of g 
and t choices on population flows. If they explicitly perceive that 
population is fixed, in choosing g and t to maximize equation (13),
besides a balanced budget they would see the constraints of equations 
(1) and (2). Then in maximizing equation (13), constrained by these 
interactions, they would choose g according to equation (15) below. 
This is part of the bad-politics equilibrium, where g is “overprovided” 
relative to equation (12) because it is recognized to be partially 
financed out of the rental income of absentee landowners (or a re¬ 
duced land purchase price if residents buy rather than rent). 


Local Public Goods: Tiebout with Bad Politics 

Tiebout with bad politics, or at least one version of it, is the
intracommunity solution in Epple and Zelenitz (1981). To enact it, both
landowners and residents assume passive roles and there is a presumably
dictatorial local government that seeks to maximize tax collections less 
public expenditures, inhibited by the market constraint that they 
must pay to their residents the utility levels prevailing in the economy. 
Given the fixed utility residents must receive, the government is try¬ 
ing to usurp land rents, which for some reason they do not have direct 
access to. In maximizing tpF(·) − cgN they face the same market
conditions as in equations (1), (3), (4), and (6), and the same response
of p, F(·), and N to changes in t and g, as in equations (7)-(9). The
result, as in Epple and Zelenitz (1981), is

\[ tpF(\cdot) - cgN = \frac{\varepsilon_K}{\theta_K}\,pF(\cdot) > 0, \tag{14} \]

\[ m = \frac{c(1 - \gamma)}{1 - \eta[t/(1 + t)] + \eta\varepsilon_K/[\theta_K(1 + t)]}. \tag{15} \]

Relative to the optimality condition in equation (12), (15) is adjusted
to deal with the impact of trying to extract maximal profits; g is
“overprovided” relative to (12) because it can in part be financed by
land rents. However, equation (14) is the key result, representing
fiscal exploitation.
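A small numerical comparison may help fix ideas. The sketch below assumes the following readings of the two results, both reconstructed from the text: the fiscal surplus in (14) equals (ε_K/θ_K)·pF, and (15) adds the term ηε_K/[θ_K(1 + t)] to the denominator of (12). All function names and parameter values are hypothetical.

```python
def second_best_wedge(eta, t, gamma=0.0):
    """m / c under the assumed form of equation (12)."""
    return (1.0 - gamma) / (1.0 - eta * t / (1.0 + t))


def bad_politics_wedge(eta, t, eps_k, theta_k, gamma=0.0):
    """m / c under the assumed form of equation (15)."""
    denominator = (1.0 - eta * t / (1.0 + t)
                   + eta * eps_k / (theta_k * (1.0 + t)))
    return (1.0 - gamma) / denominator


def fiscal_surplus_share(eps_k, theta_k):
    """Surplus (tpF - cgN) as a share of housing revenue pF, assumed form of (14)."""
    return eps_k / theta_k


# Hypothetical values: eta = 0.5, t = 0.5, eps_K = 0.3, theta_K = 0.75.
eta, t, eps_k, theta_k = 0.5, 0.5, 0.3, 0.75
print(second_best_wedge(eta, t))                    # wedge without exploitation
print(bad_politics_wedge(eta, t, eps_k, theta_k))   # smaller wedge: more g
print(fiscal_surplus_share(eps_k, theta_k))         # share of pF extracted
```

Under these assumptions the bad-politics ratio m/c is always below the second-best ratio when ε_K > 0, which is the sense in which g is overprovided relative to (12).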

This notion of fiscal exploitation should be treated gingerly, since 
usurping land rents does not always involve bad politics. For example, 
if the government was “really bad” and confiscated the land, we would 
obviously return to the profit-maximizing, no-politics solution and 
equation (12). Second, if the government could impose head taxes in 
addition to property taxes, head taxes would exactly cover public 
expenditures and equation (12) would be satisfied. The property tax 
rate would be set to usurp the maximum land rents possible, subject 
to residents’ receiving V̄. In this case the tax rate is given by

\[ t = \frac{\varepsilon_K}{\theta_K}. \tag{16} \]

This tax rate increases with supply inflexibility and land’s share in
output. Note that in intercommunity equilibrium (next in Sec. IB) this
head tax solution and the no-politics solution would not be equivalent, 
since with head taxes residents will face two sets of taxes, altering the 
general equilibrium outcomes and lowering the utility levels of resi¬ 
dents realized in the economy. 


B. Intercommunity Equilibrium

Let us now turn to an examination of long-run equilibrium in the 
markets across communities. The examination is not an exercise in 
proving existence and uniqueness or dealing with the issue of
(in)divisibility (Westhoff 1977; Ellickson 1979; Vohra 1982). It is a
statement and evaluation in the present context of the usual conditions
necessary to prove existence and uniqueness. Given these conditions, 
my characterization of long-run equilibrium will generally be consis¬ 
tent with general equilibrium models where prices are uniquely deter¬ 
mined and solutions are Pareto efficient, or in my case with the prop¬ 
erty tax distortion, second best. In comparison with recent literature, 
the basic condition I will impose is that the price of land (of uniform
quality) be equalized across communities in long-run solutions, just as 
prices of any commodity of uniform quality are equalized in general 
equilibrium solutions. This condition is absent from the hypothesized 
long-run solutions in Epple and Zelenitz (1981) and Yinger (1982) 
and perhaps in Bucovetsky (1982). If this condition is to be met, for 
reasons we will focus on, it will be necessary in general to give land 
developers an active role in the model. 

In my analysis, I consider two cases—one where communities are 
fixed in number and size and one where they are not. We will see that
fixing community numbers and sizes need not be critical. However,
the assumption of flexibility is more consistent with the facts and
allows for more flexible adjustment processes. Throughout I assume
that the number of communities is sufficiently large to be consistent 
with the concepts inherent in models characterized by perfect compe¬ 
tition and divisibility. 


Fixed Community Sizes and Numbers 

I will start by asking in what situations each of the three
intracommunity solutions depicted in Section IA is consistent with long-run
equilibrium. Before starting, note that in all solutions, in equilibrium, 
land and housing supplied across communities must be consumed 
and all people must be housed. That is, in addition to intracommunity 
equilibrium conditions, for n communities, 

\[ \sum_{i=1}^{n} N_i\, l_i(\cdot) = L, \qquad \sum_{i=1}^{n} N_i = N, \tag{17} \]

where N_i is the population of community i, N total economy
population, l_i(·) the derived demand for land, and L total economy land
supply.

No politics. The no-politics solution is generally consistent with a
characterization where equilibrium is unique and prices of land of
uniform quality are equalized across communities. If the land
company in one community is earning either more or fewer rents by
having land in one particular use versus another, then, respectively, 
either other land companies will convert their communities to this 
particular use or this community will switch uses until land prices are 
equalized across communities and competing uses such that equations 
(17) are satisfied. In equilibrium, the market for a homogeneous im¬ 
plicitly divisible commodity (land, in this situation) must clear at a 
uniform price. 

As a particular example, suppose there are two income groups who 
stratify into different types of communities. We assume that either 
stratification is “naturally stable," so the poor do not chase the rich, or 
exclusionary zoning or other exclusionary activity of landowners
ensures stability. The question is, What determines the allocation of the
fixed number of communities between the rich and the poor? Land
companies will adjust the land use of their communities until the
allocation of rich and poor communities is such that within and across 
communities the derived demand for land in housing equals supply at
equalized land prices. As another example, What happens if there are
inefficient land companies? With free entry of entrepreneurs, at the 
limit inefficient land companies will be bought out and supplanted by 
efficient ones. 

Good politics. The good-politics intracommunity solution will be
the same as the no-politics solution, if all communities are identical.
With identical communities, provided g and t choices are always
governed by, say, equations (11) and (12), people will flow between
communities to equalize housing and land prices such that demands and
supplies of land within and across communities are equated. 

However, in the good-politics case, if again we have two income 
groups between which we must allocate our communities, we have a
problem. Without a mechanism involving land companies/developers
that can reallocate community land uses, communities generally will
not be allocated such that land prices are equalized throughout the
model. Without such a mechanism any number of solutions are
possible. In comparing the sets of rich and poor communities, there will be
a wide range of allocations of the fixed number of communities
between rich and poor and resulting divergent land prices between rich
and poor communities consistent with intracommunity equilibrium.²
Moreover, without a land reallocation mechanism, within the set of
either rich or poor communities any one community can provide
services inefficiently (i.e., not governed by eqq. [11] and [12]) and still
keep some residents, since the inefficiencies will be capitalized into 
lower land prices. 

Without a land market mechanism for allocating intercommunity 
land uses, the long-run equilibrium solution is in some sense
arbitrary: perhaps, in informal terms, a function of the history of which
communities were “occupied” by the rich versus the poor, or the
efficient versus the inefficient, first. Histories are important, but I
believe they go hand in hand with the dynamic models, not traditional
static long-run models. In a long-run equilibrium with a fixed number 
of communities, active land companies are needed to alter community 
land uses to equalize marginal products of land. However, if we in¬ 
troduce active land companies, politics become superfluous, since a 
no-politics equilibrium is equivalent to a good-politics solution with 
active land companies. 


² This holds even if we impose “natural” stability of stratification, which means that
given g^H, p^H in a rich community, poor people will not want to enter. For any g^H, there
will be a range of p^H where poor people will not want to enter (Ellickson 1970; Westhoff
1977; Epple et al. 1983). However, even if poor people want to enter (even in the
solution where land prices are equalized across communities), stratification can be
maintained by zoning (or by the developer’s simply refusing to sell to poor people).

Bad politics. The analysis of the bad-politics solutions may by now
be apparent. With active land developers/companies/owners, bad
politics are not possible because the landowners will collectively refuse to
allocate their land in any community to those attempting to usurp 
their incomes. With “passive” landowners, usurpation is, of course, 
feasible. But the possibility for usurpation goes hand in hand with 
nonuniqueness of long-run solutions and the resulting importance of
history.

The notion of a world where landowners can act collectively within 
communities to turn over entire populations may seem extreme. One 
can posit an adjustment process where community compositions can 
change over time, but that is in the realm of a dynamic world. In a 
static world, the more realistic instantaneous adjustment process lies 
in allowing community boundaries and numbers to be flexible. In 
essence the market for land becomes explicitly (rather than implicitly 
through changeover in land uses between fixed communities) like any 
other market, with communities able to annex and detach land at the 
perceived fixed market prices and with developers able to buy up 
masses of land to form new communities. I turn to this situation next. 


Endogenous Communities 

With land capable of being added to and detached from existing 
communities or used in the formation of new communities at a con¬ 
stant (perceived) marginal cost, capitalization and bad politics are 
ruled out because the owner of each unit of land has the option to 
alter his land use. Then any temporarily low-priced uses will contract 
and high-priced uses expand. This is the Hamilton (1975) world (al¬ 
beit without “perfect” zoning) where communities can be costlessly 
reshuffled and redesigned. 

The good-politics and no-politics intracommunity solutions are 
both feasible subject to the constraints that the inter- and intracom¬ 
munity derived demands for land exhaust supplies. The depiction of 
intracommunity solutions can easily be adjusted to deal with variable 
land supply. For example, with no politics, each landowner maximizes
the profits from each potential resident, (p_L − p̄_L)l + pht − cg, where
p̄_L is the fixed opportunity cost of land and p_L and p the prices charged
for land and housing. Maximization is subject to the equal utility and
the builder no-profit constraints in equations (4) and (6). The result is,
again, equations (11) and (12).

The role of entrepreneurial developers in this context is less dra¬ 
matic than when there are a fixed number of communities of fixed 
size. It is the usual role played by entrepreneurs in general
equilibrium models. Entrepreneurs are there to set up new “clubs” or
communities (corresponding to firms) to supplant inefficient ones.
Communities may be governed by voting rules or run autocratically by the
entrepreneur. Entrepreneurs need to participate actively only to the 
extent necessary to reshuffle land from inefficient to efficient uses. 

The only problem with the endogenous community model is that, 
as always with constant returns to scale, firm or community size is 
indeterminate. To get determinate community sizes, we could, for 
example, as in the theory of the firm, allow the unit cost of public
services, c, to be a U-shaped function of N. Then each community
would be the size that minimizes c. Providing there are a large enough
number of communities of each type, any lumpiness problems (frac¬ 
tions) that arise from trying to divide a fixed population among an 
integer number of specific-size communities effectively disappear. 
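The argument about determinate community size and vanishing lumpiness can be sketched numerically. The cost function below, c(N) = base + fixed/N + congestion·N, is a hypothetical U-shaped form chosen only for illustration; the text does not specify one. The function names and all parameter values are likewise assumptions.

```python
import math


def unit_cost(n, fixed=100.0, congestion=0.01, base=1.0):
    """Hypothetical U-shaped unit cost of public services as a function of N."""
    return base + fixed / n + congestion * n


def efficient_size(fixed=100.0, congestion=0.01):
    """Community size that minimizes unit_cost: N* = sqrt(fixed / congestion)."""
    return math.sqrt(fixed / congestion)


def lumpiness(total_population, n_star):
    """Split a population among an integer number of equal-size communities.

    Returns the number of communities and the relative gap between the
    realized community size and the cost-minimizing size n_star.
    """
    count = max(1, round(total_population / n_star))
    realized_size = total_population / count
    return count, abs(realized_size - n_star) / n_star


n_star = efficient_size()
for population in (1040.0, 100040.0):
    count, gap = lumpiness(population, n_star)
    print(population, count, gap)
```

In this sketch the relative gap between realized and cost-minimizing size shrinks as the number of communities of a given type grows, which is the sense in which the integer problem effectively disappears.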


II. Directions for Future Work 

The analysis of this paper is placed in a long-run model, and in this 
context the roles of land developers and governments in part seem 
duplicative. In a dynamic context where durable land and capital are 
not so malleable and agents have horizons extending beyond 1 period 
the issues are altered. In this context, stronger differences between 
the roles of land developers and governments are brought out. Con¬ 
flicts among developers, governments, and residents are accentuated 
and achievement of efficient solutions is much more inhibited. To see 
the conflicts and issues involved, I will illustrate the type of problem
and point out the direction for future work. 

As an example, consider a 2-period world and look at one
community. In period 1 initial residents move into the community where they
plan to stay for the second period in the same house. Housing
consumption chosen in period 1 by initial residents is also their
consumption in period 2. In period 2, there may be new additional entrants
and further development of the community. In period 2, the
community will have a history: a stock of durable houses, perhaps a charter,
and a set of laws and zoning regulations. Moreover, in period 1 the
actions of economic agents will be affected by their expectations as to
future public policies. As one example, the purchase decisions of
initial residents in terms of their willingness to pay for housing in this
community and their choice of housing consumption levels will be
critically affected by their expectations about what will happen in
period 2 in terms of future public service levels and future
community tax bases and rates.

In period 2, when new people enter, there will be a conflict between 
the developer and initial residents over what public service levels to 
set and what sizes to zone lots for new entrants. Initial residents will
want to zone lots to be very large so as to maximize the tax payments
of new residents, and they will want to set public service levels to
satisfy just their own preferences. Developers will want to set small lot
sizes to maximize net-of-tax land sale profits and to set public service
levels to satisfy just the preferences of new entrants so as to maximize
their willingness to pay to live in the community. Pareto-efficient lot
sizes and public service levels will differ from both these solutions, 
reflecting, respectively, both tax revenues and land sale profits and 
preferences of both new and old residents. 

As analyzed in Henderson (1980), these conflicts are resolved and 
Pareto-efficient solutions achieved in a no-politics situation by the 
developer’s choosing both current and future lot sizes and public 
service levels in period 1 so as to maximize the present value of
profits. Maximizing the present value of profits means the developer 
takes into account the impact of his expected future policies on the 
actions (willingness to pay) of initial residents. However, actually 
achieving this optimal solution requires that the developer fix con¬ 
tractually, from the beginning, his own future policy actions and 
those of residents, with expectations being realized. As noted in the 
previous paragraph, the problem is that once into the future it is 
financially advantageous for the developer to break the contract and 
set lot sizes and public service levels at other than Pareto-efficient
levels. Initial residents realize this and hence demand (in competitive 
development markets) in period 1 contractual guarantees. These are 
illustrated by the homeowner association contracts in “new towns" in 
the United States (Reichman 1976). 

This problem is a classic illustration of a time inconsistency problem 
(Kydland and Prescott 1977) where consistent solutions in which 
profits are maximized period by period, taking past decisions of resi¬ 
dents as given, are not optimal. Optimality may require the removal 
of discretionary policymaking in period 2. In a development context 
the policy instruments of concern are future public service levels, 
taxes, zoning laws, and development rights (the free issuance vs.
denial of building permits).

Future work is needed to determine answers to the following ques¬ 
tions that arise in this context. With good politics does this same type 
of problem come up? Or, under certain institutional arrangements, 
can consistent solutions be optimal ones? What would the possible 
institutional arrangements be? If under typical institutional arrange¬ 
ments optimal solutions are not possible, what types of consistent 
solutions will emerge and what will be their characteristics with or 
without politics? How do initial residents and/or developers manipu¬ 
late decision variables to influence one another's actions to improve 
their own welfare? In a multiperiod model what role would
“reputation” play in enforcing contractual arrangements (Barro and Gordon
1983)? Finally, how do land markets operate to allocate land across 
communities, given expectations about how efficiently communities 
will operate in the future? 
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Discriminatory, Status-based Wages among 
Tradition-oriented, Stochastically Trading 
Coconut Producers 


George A. Akerlof 
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A robust model of discrimination is presented; even if there is a 
significant minority without a taste for discrimination and even if 
there is capital transfer among entrepreneurs with different tastes 
for discrimination, no entrant can profit by violating the discriminatory
custom. The key innovation in this model of discrimination is
that markets are in some sense smaller than the Walrasian market. 
All traders have a chance of trading with one another. And at the 
time of trade there is no other equally satisfactory alternative trading 
partner. This assumption corresponds to empirical sociological
studies that similarly find markets to be small.


This paper presents a model of discrimination in which trade in
goods occurs in random encounters between agents. The assumption of
Becker (1957) that trade occurs in Walrasian markets is altered, making
it easier to explain wage differentials for labor of a preferred type
(W-labor) relative to an unpreferred type (B-labor) of equal quality.

There are two reasons why such differentials are not explained by
Becker’s model (see Arrow 1972). First, the proportion of
nondiscriminatory entrepreneurs and nondiscriminatory purchasers of
goods need be no greater than the proportion of the unpreferred
labor type for the disappearance of the equilibrium wage differential.
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Second, in an equilibrium in which the marginal entrepreneur hiring 
B-labor has a positive taste for discrimination, those who hire B-labor
could profitably buy the capital of entrepreneurs who hire W-labor
with resultant decreases in the wage differential. These properties of
Becker’s model occur because those who “make the market” for B-labor
and B-produced goods are those with the least discriminatory
tastes: the least discriminatory entrepreneurs hire B-labor because
they are willing to pay the highest wages to B-labor; and the least
discriminatory purchasers of B-produced goods are willing to pay the
highest prices for such goods. In contrast, this paper constructs a
model in which all individuals, not just the least discriminatory, are
potential traders with firms that use B-labor. And, as a result, even
with a significant minority of nondiscriminatory traders, a
nondiscriminatory entrant may not be able profitably to disobey prevailing
social customs.


I. The Model Explained 

A model, following Diamond (1982), is constructed with the key
feature that trading partners are those who randomly meet in a search
process. This contrasts with the Walrasian model in which trading
partners are any pair of agents, with the buyer willing to pay at least
the market price and the seller willing to accept at least the market
price. In Diamond's model, because trading occurs only when
partners meet, the loss of any potential trading partner has a cost, since an
agent boycotted by any particular agent will not be able to sell
immediately at the market price. Because any agent is a potential trading
partner and the loss of a trading partner is costly, every purchaser
with a high discrimination coefficient imposes a cost on an entering
firm that disobeys discriminatory social customs.

The next section will describe the equilibrium of a model in which
there is a universally followed discriminatory custom in production
that is inefficient in its use of labor and consequently raises
production costs. It is then asked whether a maverick entering firm could
profitably break the social custom when a fraction of traders will
boycott the firm that has broken the social custom. (In Becker’s
framework these boycotters would have high buying-discrimination
coefficients.)

The differences for discrimination between the Walrasian model
and the random-trade model are illustrated in figures 1a and 1b,
which plot the cost of production plus sales as a function of the
fraction of boycotting traders. In both models the costs of production are
lower for the nondiscriminatory entrant than for the discriminatory
entrepreneurs. But in the Walrasian case the cost of sales is
independent of the number of boycotters as long as the nonboycotters are a
larger share of the market than the entrant’s production. If the
entrant is sufficiently small, his cost of sales rises only as the number of
boycotters approaches 100 percent. In this case he must pay a sales
premium equal to the discrimination coefficient of the least
discriminatory boycotter. This is pictured in figure 1a; the cost of production
and sales of a nondiscriminatory entrant into a market with all goods
produced according to the discriminatory custom is plotted as a
function of the percent of boycotters. This maverick has lower costs than
the discriminatory firms if the fraction of boycotters is less than 100
percent.

Fig. 1. Cost of production and sales for nondiscriminatory entrant and
discriminatory firm in Walrasian (1a) and random-trade (1b) models. In
both 1a and 1b, (c + s) represents the combined cost of production and
sales of a firm that follows the discriminatory custom as a function of
the percentage of boycotting buyers. Similarly, (c + s)' represents the
cost of production and sales of a nondiscriminatory entrant as a function
of the percentage of boycotters. In the Walrasian model (c + s)' is flat
up to 100 percent boycotters. In the random-trade model (c + s)' rises
continuously with the percentage of boycotters. If the percentage of
boycotters exceeds $\delta'$ in fig. 1b, nondiscriminatory entrants make
lower profits than firms that follow the discriminatory custom.

In contrast figure 1b plots the cost of production plus sales for an
entrant into the same market in a model with random trade. If the
fraction of boycotters is zero, then the costs of the entrant are lower
than the costs of the discriminatory firms, because the entrant uses
labor efficiently in production. But as the number of boycotters rises,
the search time to make a sale by the entrant increases and,
consequently, so do sales costs. As a result, the entrant’s total costs of
production plus sales may, as pictured in figure 1b, rise in excess of
the total costs of a discriminatory firm. Even with a significant
minority of buyers with no taste for discrimination (i.e., a minority smaller
in proportion than $1 - \delta'$ in fig. 1b) an entrant even of size zero
cannot profitably break the social custom. Furthermore, unlike
Becker's model, there is no excess profit to be made by entrepreneurs with
no taste for discrimination. And entrepreneurs with a low taste for
discrimination cannot profitably purchase the capital of those with a
high taste for discrimination.
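
The search-cost mechanism described for the random-trade case can be sketched numerically. The snippet below is only an illustration under assumed notation borrowed from the formal model of Section II: carts meet at rate S/γ, and a fraction δ of potential partners refuses to trade with the entrant, so the entrant's expected wait for a sale is taken to be γ/[(1 − δ)S]. The function name and the numbers are hypothetical and are not part of the paper.

```python
def expected_time_to_sell(gamma: float, carts_selling: float, delta: float = 0.0) -> float:
    """Expected wait for a sale when meetings arrive at rate S/gamma and a
    fraction delta of partners boycott the seller (assumption: each meeting
    independently leads to a trade with probability 1 - delta)."""
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    return gamma / ((1.0 - delta) * carts_selling)


# Hypothetical numbers: gamma = 5 and S = 50 carts in the marketplace.
for delta in (0.0, 0.25, 0.5, 0.75):
    print(delta, expected_time_to_sell(5.0, 50.0, delta))
```

The wait, and with it the entrant's sales cost, rises continuously in the share of boycotters, which is the qualitative difference from the Walrasian case, where sales costs stay flat until boycotters approach 100 percent.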


The Assumption of the Random Appearance of Trading Partners

The model of trade that occurs between partners each of whom
values the other's patronage because the market is fairly small
corresponds to an empirical view of markets consistent with sociological
studies. Work on “weak ties” following Granovetter (1973) has shown
that many exchanges in both factor and product markets occur 
through mutual contacts. According to Macaulay (1963), engineering 
firms value their customers, and as a result most business transactions 
are on a less than strictly contractual basis; many transactions that are 
costly to one party and beneficial to another (such as the cancellation 
of orders) are performed without consideration. Rendering of such 
services to “valued customers” could be profitable only in a market in 
which alternative trading partners were scarcer than in the Walrasian 
model. 

II. The Formal Model 

Diamond’s (1982) model will be revised for the purpose of
parsimoniously demonstrating the possibility of discriminatory equilibria. As in
Diamond’s model, economic activity occurs on a tropical island and
consists of picking and marketing coconuts. There are $N$ carts (i.e.,
firms, each of which owns one unit of capital) on the island. The
length of time to fill a cart with coconuts depends on the number of
labor efficiency units in two types of jobs, numbered type 1 and type
2. A cart working with $N_1$ labor efficiency units in type 1 jobs and $N_2$
labor efficiency units in type 2 jobs can be filled in length of time
$N_1^{-1/2} N_2^{-1/2}$. Carts leave the coconut groves after they have been filled
for a market area where other carts with the similar purpose of
trading coconuts will be randomly encountered.

The islanders search for other carts with which to trade coconuts 
because they have utility for coconuts, but there is a taboo against the 
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consumption of coconuts gathered by one’s own firm. The islanders 
are also quite traditional: firms that break the social custom of the 
island by hiring men or women in different proportions from other 
firms will be shunned by some fraction of the traditional firms (i.e., 
some firms that obey the traditional hiring practices will not trade 
with other firms that depart from the tradition). 


Production 

There are two distinct groups of people in this model, men and
women. There are $L$ men and $L$ women. One-half of the men,
appropriately named type 1 men, have a comparative advantage in type 1
jobs. They contribute one labor efficiency unit in a job of type 1 and $\beta$
labor efficiency units in a job of type 2, where $\beta$ is less than one. Type
2 men, who are also $L/2$ in number, have the opposite comparative
advantage: type 2 men contribute $\beta$ (< 1) labor efficiency units in jobs
of type 1 and one labor efficiency unit in jobs of type 2.

Women’s productive abilities are exactly like men’s. One-half of all
women are of type 1. These women, like type 1 men, contribute one
labor efficiency unit to type 1 jobs but only $\beta$ (< 1) units to type 2 jobs.
Symmetrically, the other half of all women are of type 2: they
contribute $\beta$ (< 1) labor efficiency units in type 1 jobs and one labor
efficiency unit in type 2 jobs.

Trading: Probability of Meeting 

Carts are met at a rate proportional to the number of carts in the
marketplace. Any individual cart is either in the marketplace trying to
sell a load of coconuts or else in the coconut groves being loaded with
coconuts. Any two carts in the marketplace will meet randomly with
probability $(1/\gamma)\,dt$ in a short period of time $dt$. The probability of a
particular cart’s meeting another in the marketplace is $[(S - 1)/\gamma]\,dt$
when there are $S$ carts in the marketplace. We assume that one is
sufficiently small relative to $S$ that it is suppressed in the rest of the
paper.


Trading: Discrimination in Trade 

If two carts meet in the marketplace but one cart has used male and
female labor in jobs 1 and 2 in different proportions from the norm,
only with probability $1 - \delta$ ($0 < \delta < 1$) will a trade occur. The variable
$\delta$ may be dependent on the deviation in proportions of men and
women in jobs 1 and 2; it may also be dependent subtly on the social
customs in rather complicated ways. For the purpose of the
demonstration here of the existence of discriminatory equilibrium, it is
assumed that if the proportions are at all different from the norm, then
$\delta$ is a positive constant.


The Nature of the Bargain between Two Trading Carts

When carts meet they trade (except in the case of discrimination
against an innovator). The barter price of coconuts is indeterminate
unless a bargaining solution is specified. I assume an axiom of
symmetry. If the carts are in exactly symmetric positions the trade of
coconuts will be one-to-one. However, if the trade has more value to
one cart than another, the cart for which the trade has more value is
in a weak bargaining position. Trade will occur at less than a
one-to-one rate for the disadvantaged cart.


The Market for Labor

Labor supply of both men and women is totally inelastic. There is no
discount rate in this economy. Both labor and owners of carts want to
maximize the undiscounted value of their income in terms of
coconuts. Thus, cart owners maximize the expected returns on carts
per unit time; the competitive wage rate is the marginal product of a
laborer of a given type in production on his own cart. A competitive
labor market is assumed, so that this is the wage actually received.

III. The Nondiscriminatory Equilibrium

Note that the model described in the previous section would exactly
correspond to the standard neoclassical model if the coefficients of $N_1$
and $N_2$ summed to less than unity, which is a matter of no importance,
and if $\gamma$ were zero, which is a matter of great importance. If it takes
length of time $N_1^{-\alpha_1} N_2^{-\alpha_2}$ ($\alpha_1 + \alpha_2 < 1$) to fill a cart, then
$N_1^{\alpha_1} N_2^{\alpha_2} N^{1-\alpha_1-\alpha_2}$ is a neoclassical production function for output. With
$\gamma = 0$ the only equilibrium possible in this model is the neoclassical
equilibrium with labor and capital receiving their respective marginal
products.


Factor Allocations 

Let us consider a natural nondiscriminatory equilibrium of this 
model. Jobs 1 and 2 will be filled by men and women in equal number,
with jobs of type 1 divided equally between type 1 men and women 
and jobs of type 2 similarly divided between type 2 men and women. 
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Division of Carts between Production and Trade 

Let $\theta$ denote the proportion of carts engaged in selling; $(1 - \theta)$ will
then be the proportion of carts engaged in production of coconuts. It
is possible to discover what the value of $\theta$ must be in this natural
nondiscriminatory equilibrium. Having solved for the equilibrium
value of $\theta$, denoted $\theta^*$, it will then be possible to describe all the key
variables in this economy: wages for men and women of types 1 and 2,
and also profits.

The ratio $(1 - \theta^*)/\theta^*$ will be equal to the ratio of the time to fill a
cart to the time it takes to sell a cartful of coconuts once they have
been taken to market. In this nondiscriminatory equilibrium the
length of time to fill a cart must be

$$\left(\frac{L}{N}\right)^{-1}, \tag{1}$$

as can be found by substitution into the earlier formula of the number
of labor efficiency units in type 1 and 2 jobs. The time to sell a cartful
of coconuts is

$$\frac{\gamma}{S} = \frac{\gamma}{\theta^* N}. \tag{2}$$

Thus in equilibrium

$$\frac{1 - \theta^*}{\theta^*} = \frac{(L/N)^{-1}}{\gamma/(\theta^* N)}. \tag{3}$$

Equation (3) is a quadratic equation with a positive solution for $\theta^*$,

$$\theta^* = \frac{-1 + \sqrt{1 + 4x}}{2x}, \tag{4}$$

where $x = N^2/\gamma L$. A bit of calculus shows the relation between $\theta^*$ and
$x$ according to (4). For $x = 0$, by L’Hôpital’s rule, $\theta^* = 1$. For $x = \infty$,
$\theta^* = 0$. And for $x$ between zero and infinity, $d\theta^*/dx < 0$.
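
The limiting behavior of equation (4) is easy to check numerically. The sketch below assumes the reading of (3) as the quadratic xθ² + θ − 1 = 0 with x = N²/γL; the function name is hypothetical. It uses the algebraically equivalent form 2/(1 + √(1 + 4x)), obtained by multiplying numerator and denominator of (4) by 1 + √(1 + 4x), so that x = 0 needs no special case.

```python
import math


def theta_star(x: float) -> float:
    """Positive root of x*theta**2 + theta - 1 = 0 (equation (4)),
    written as 2 / (1 + sqrt(1 + 4x)) to avoid dividing by x."""
    if x < 0.0:
        raise ValueError("x must be non-negative")
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))


for x in (0.0, 0.01, 0.5, 2.0, 50.0, 1e6):
    theta = theta_star(x)
    residual = x * theta * theta + theta - 1.0  # should be ~0
    print(f"x={x:g}  theta*={theta:.6f}  residual={residual:.1e}")
```

The printed values start at 1 for x = 0, fall monotonically, and approach 0 as x grows, in line with the three properties stated in the text.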


Wages and Profits 

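In the nondiscriminatory equilibrium, the length of time to produce
and sell a cartful of coconuts is

$$\left(\frac{L}{N}\right)^{-1} + \frac{\gamma}{\theta^* N}. \tag{5}$$

Accordingly, output per cart per period is the reciprocal of (5) or

$$\frac{1}{(L/N)^{-1} + \gamma/(\theta^* N)}. \tag{6}$$

The wage of type 1 workers $w_1$ is

$$w_1 = \left.\frac{\partial}{\partial J_1}\left[\frac{1}{J_1^{-1/2} J_2^{-1/2} + \gamma/(\theta^* N)}\right]\right|_{\substack{J_1 = J_2 = L/N \\ \theta^* = \text{RHS of (4)}}} = \frac{\tfrac{1}{2}(L/N)^{-2}}{\left[(L/N)^{-1} + \gamma/(\theta^* N)\right]^{2}}, \tag{7}$$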


where $J_i$ represents the number of labor efficiency units in a job of
type $i$, $i = 1, 2$. A type 1 worker embodies one labor efficiency unit.
By symmetry the wage of type 2 workers is the same as that of type 1
workers and is also given by the right-hand side of (7). The share of
profits can be calculated using the definition of profits as production
net of labor costs and is equal to

$$\frac{\gamma/(\theta^* N)}{(L/N)^{-1} + \gamma/(\theta^* N)}, \tag{8}$$

an expression that is always between zero and one, as theory suggests
it should be.

These explicit calculations of productivity (6), wages (7), and profit
share (8) will allow comparison with the discriminatory equilibrium 
described in the next section. 
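
As a consistency check on (6), (7), and (8), the following sketch evaluates them for one set of parameters and confirms that output net of the wage bill reproduces the profit share. It assumes the fill-time technology $J_1^{-1/2} J_2^{-1/2}$ and $2L/N$ workers per cart, each supplying one efficiency unit; the function name and the parameter values are hypothetical.

```python
import math


def nondiscriminatory_equilibrium(L: float, N: float, gamma: float) -> dict:
    """Evaluate theta*, output (6), wage (7) and profit share (8)."""
    x = N**2 / (gamma * L)
    theta = 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))   # same root as (4)
    fill_time = (L / N) ** -1                        # (1)
    sell_time = gamma / (theta * N)                  # (2)
    output = 1.0 / (fill_time + sell_time)           # (6)
    wage = 0.5 * (L / N) ** -2 * output**2           # (7)
    profit_share = sell_time / (fill_time + sell_time)  # (8)
    return {"theta": theta, "fill_time": fill_time, "sell_time": sell_time,
            "output": output, "wage": wage, "profit_share": profit_share}


# Hypothetical island: 1,000 men, 1,000 women, 100 carts, gamma = 5.
eq = nondiscriminatory_equilibrium(L=1000.0, N=100.0, gamma=5.0)
wage_bill = (2.0 * 1000.0 / 100.0) * eq["wage"]      # 2L/N workers per cart
print(eq)
print("share from accounting:", 1.0 - wage_bill / eq["output"])
```

With these numbers x = 2, so half the carts are selling at any time, and the accounting share agrees with (8).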

It should be obvious that if the allocation of labor across jobs is as
described above no firm could enter, pay higher wages using labor in
different proportions, and earn positive profits even in the total
absence of discrimination (i.e., with $\delta = 0$), since, given wages, labor is
used in the most profitable fashion. Therefore, provided $\delta$ is at all less
than one for an innovating firm, profitable innovative entry will not
be possible in this equilibrium.


IV. A Discriminatory Equilibrium 

Factor Shares 

The preceding nondiscriminatory equilibrium is to be contrasted with
discriminatory equilibria in which men and women receive different
wages, although not in the same jobs. Consider an equilibrium in
which all women and all type 2 men work in type 2 jobs. Type 1 men
work in type 1 jobs. Given what wages of men and women of the two
types must be for such an equilibrium, it will be shown that values of $\delta$
can be chosen so that no innovative firm can profitably hire men and
women in different proportions.

In this equilibrium output per cart per period must be

$$\frac{1}{\left(\dfrac{L}{2N}\right)^{-1/2}\left(\dfrac{L}{N} + \dfrac{\beta}{2}\dfrac{L}{N}\right)^{-1/2} + \dfrac{\gamma}{S}}. \tag{9}$$

The first term in the denominator of (9) represents the length of
time to fill a cart, given $L/2N$ labor efficiency units in jobs of type 1
(from type 1 men) and $(L/N) + (\beta/2)(L/N)$ labor efficiency units in
type 2 jobs: $L/N$ of these efficiency units come from type 2 men and
women together; $(\beta/2)(L/N)$ of these come from type 1 women
working in type 2 jobs at efficiency $\beta$. The second term of the denominator
of (9) represents the length of time to sell the output of a cart.


Number of Carts Engaged in Marketing 


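As before it is possible to solve for $S$. Let $\theta^{**}$ be the proportion of
carts engaged in selling, so that $S = \theta^{**}N$, and, as before,

$$\frac{1 - \theta^{**}}{\theta^{**}} = \frac{\left(\dfrac{L}{2N}\right)^{-1/2}\left(\dfrac{L}{N} + \dfrac{\beta}{2}\dfrac{L}{N}\right)^{-1/2}}{\gamma/(\theta^{**}N)}. \tag{10}$$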


This yields a root for $\theta^{**}$,

$$\theta^{**} = \frac{-1 + (1 + 4x)^{1/2}}{2x}, \qquad x = \frac{N^2}{\gamma L \left(\tfrac{1}{2}\right)^{1/2}\left(1 + \tfrac{\beta}{2}\right)^{1/2}}. \tag{11}$$

Relative to the nondiscriminatory equilibrium, the denominator of $x$
has decreased so that the proportion of carts that are selling has also
decreased. It takes longer to sell output.
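
The comparison between the two equilibria can be illustrated with a short sketch. It assumes the reading of (4) and (11) in which both selling shares are the same function of x, with x = N²/γL in the nondiscriminatory case and that quantity divided by [(1/2)(1 + β/2)]^{1/2} in the discriminatory case. Function names and parameter values are hypothetical.

```python
import math


def selling_share(x: float) -> float:
    """Positive root of x*theta**2 + theta - 1 = 0, as in (4) and (11)."""
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))


def x_nondiscriminatory(L: float, N: float, gamma: float) -> float:
    return N**2 / (gamma * L)


def x_discriminatory(L: float, N: float, gamma: float, beta: float) -> float:
    # The factor below is < 1 whenever beta < 1, so x rises and theta falls.
    return N**2 / (gamma * L * math.sqrt(0.5 * (1.0 + beta / 2.0)))


L, N, gamma = 1000.0, 100.0, 5.0   # hypothetical values
theta_nd = selling_share(x_nondiscriminatory(L, N, gamma))
for beta in (0.1, 0.5, 0.9):
    theta_d = selling_share(x_discriminatory(L, N, gamma, beta))
    print(f"beta={beta}: theta**={theta_d:.4f} < theta*={theta_nd:.4f}",
          f"sell time {gamma / (theta_d * N):.4f} vs {gamma / (theta_nd * N):.4f}")
```

For every β below one the selling share is smaller and the time to sell is longer in the discriminatory equilibrium.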


Wages and Profits 

The wage rate per efficiency unit in type 1 and type 2 jobs is found by
taking the derivatives with respect to $J_1$ and $J_2$, respectively, of
expression (12) for output per unit time or the derivative of

$$\frac{1}{J_1^{-1/2} J_2^{-1/2} + \gamma/(\theta^{**} N)}, \tag{12}$$

evaluated with $J_1 = L/2N$, $J_2 = L[1 + (\beta/2)]/N$, and $\theta^{**}$ given by (11).
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Doing this algebra yields an expression for the wage of type 2 labor 
that is unambiguously smaller than in the nondiscriminatory equilib¬ 
rium. There are three reasons: in the new equilibrium the number of 
substitute labor efficiency units in type 2 jobs has risen, the number of 
complementary labor efficiency units in type 1 jobs has fallen, and the 
length of time to sell the product has risen. The wage per efficiency 
unit in type 1 jobs changes ambiguously. The number of substitute
labor efficiency units in type 1 jobs has fallen, and the number of
complementary labor efficiency units in type 2 jobs has risen. Both of
these effects should raise wages in type 1 jobs, but the length of time
in selling has risen, and this may more than counteract the other two
effects.

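Profit share in the discriminatory equilibrium is given by a formula
analogous to (8) and is calculated in similar fashion. Profits per cart
are

$$\frac{\gamma/(\theta^{**}N)}{\left[\left(\dfrac{L}{2N}\right)^{-1/2}\left(\dfrac{L}{N} + \dfrac{\beta}{2}\dfrac{L}{N}\right)^{-1/2} + \dfrac{\gamma}{\theta^{**}N}\right]^{2}}. \tag{13}$$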


It is of use later that this expression is always positive. (Note also that
it is easy to show, using [9], [13], and $S = \theta^{**}N$, that profit share is
between zero and one.)
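
The wage and profit claims for the discriminatory equilibrium can be checked with a small numerical sketch. It assumes the technology $J_1^{-1/2} J_2^{-1/2}$, wages equal to the partial derivatives of (12), and the factor allocation $J_1 = L/2N$, $J_2 = (L/N)(1 + \beta/2)$; the function name and parameter values are hypothetical.

```python
import math


def discriminatory_equilibrium(L: float, N: float, gamma: float, beta: float) -> dict:
    """Evaluate theta**, output (9), wages from (12) and profits per cart."""
    j1 = L / (2.0 * N)                       # type 1 men only
    j2 = (L / N) * (1.0 + beta / 2.0)        # everyone else, in type 2 jobs
    fill_time = (j1 * j2) ** -0.5
    x = N * fill_time / gamma                # equals x in (11)
    theta = 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))
    sell_time = gamma / (theta * N)
    output = 1.0 / (fill_time + sell_time)
    w1 = 0.5 * j1 ** -1.5 * j2 ** -0.5 * output**2   # d(output)/dJ1
    w2 = 0.5 * j1 ** -0.5 * j2 ** -1.5 * output**2   # d(output)/dJ2
    profit = output - w1 * j1 - w2 * j2              # production net of labor costs
    return {"theta": theta, "fill_time": fill_time, "sell_time": sell_time,
            "output": output, "w1": w1, "w2": w2, "profit": profit}


eq = discriminatory_equilibrium(L=1000.0, N=100.0, gamma=5.0, beta=0.5)
print(eq)
# Closed form corresponding to (13): sell_time / (fill_time + sell_time)**2
print("closed-form profit:", eq["sell_time"] * eq["output"] ** 2)
```

With the same hypothetical island as before (where the nondiscriminatory wage is 0.125), the wage per efficiency unit in type 2 jobs is lower, and profits computed by accounting match the closed form.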


Choice of $\delta$ So That Innovative Entry Cannot Be Profitable

Now consider whether an innovator might use women of type 1 in
jobs of type 1 and thereby make a profit, given the existing wage
structure. Remember, however, that a fraction $\delta$ of noninnovative
carts will not trade with the maverick. Can a lower bound be found so
that for any higher level of $\delta$ profitable entry cannot occur? That is
the question of this subsection.

Before answering this key question, let me first deal with a technical
issue concerning the bargaining between the traders. Among the
traders previously described in this equilibrium all are alike;
therefore trades of coconuts will occur at a rate of one-to-one. The
maverick firm will have lower production costs but a longer wait for
potential buyers, since some fraction $\delta$ of all buyers will refuse to
trade. Therefore the innovators are in a weaker bargaining position,
since the cost of failing to make a trade is greater to them than to
noninnovators, and therefore they will barter their coconuts on a less
than one-for-one basis. Let the expected barter rate be $p < 1$.

Let the innovator hire $N_1^I$, $N_2^I$ ($I$ for innovator) workers in jobs 1 and
2 on his cart. He will hire women of type 1 in jobs of type 1. Such
women receive a wage of $w_2$ per labor efficiency unit in type 2 jobs;
they supply only $\beta$ labor efficiency units. Thus their wage is $\beta w_2$. In
type 2 jobs the innovator can arbitrarily decide the proportion of men
and women of type 2, paying each $w_2$, or he can hire women of type 1
paying $\beta w_2$ for $\beta$ labor efficiency units. The entrepreneur's profits $\Pi^I$
are

$$\Pi^I = \frac{p}{(N_1^I)^{-1/2}(N_2^I)^{-1/2} + \dfrac{\gamma}{(1 - \delta)\theta^{**}N}} - \beta w_2 N_1^I - w_2 N_2^I. \tag{14}$$


It is possible, but involves very complicated formulae, to calculate
the optimal $N_1^I$, $N_2^I$ given $w_2$ and $\delta$ and find an exact formula for the
critical level $\delta'$. For values of $\delta$ above this critical level innovative entry
is not profitable; for values below this critical level innovative entry is
profitable. For the crude purpose of establishing an upper bound for
this $\delta'$, let us make a very crude approximation. Note that

$$\Pi^I \le \frac{1}{\gamma/[(1 - \delta)\theta^{**}N]}, \tag{15}$$

using (14), $p < 1$, $(N_1^I)^{-1/2}(N_2^I)^{-1/2} \ge 0$, $\beta w_2 N_1^I \ge 0$, and $w_2 N_2^I \ge 0$. It
follows using (15) that the profits of noninnovators (given by [13]) will
be greater than the left-hand side of (15), provided

$$\delta \ge 1 - \left[\frac{\gamma/(\theta^{**}N)}{\left(\dfrac{L}{2N}\right)^{-1/2}\left(\dfrac{L}{N} + \dfrac{\beta}{2}\dfrac{L}{N}\right)^{-1/2} + \dfrac{\gamma}{\theta^{**}N}}\right]^{2}. \tag{16}$$
The right-hand side of (16) gives a crude upper bound on the critical
value $\delta'$. The crudeness of this bound is directly related to the
crudeness of (15) as a bound on the profits of the innovating firm.
Nevertheless, it has been demonstrated that suitable values of $\delta$ can be
chosen so that innovative firms cannot profitably enter in the posited
discriminatory equilibrium.¹
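
A rough numerical sketch of the crude bound follows. It rests on assumptions rather than on the paper's printed formula: incumbent profits are taken to be (sell time)/(fill time + sell time)², the innovator's profit ceiling is taken to be (1 − δ)θ**N/γ, and the bound is the δ at which the two coincide. Function names and parameter values are hypothetical.

```python
import math


def incumbent_values(L: float, N: float, gamma: float, beta: float) -> dict:
    """theta** and profits per cart of firms that follow the custom."""
    fill_time = ((L / (2.0 * N)) * (L / N) * (1.0 + beta / 2.0)) ** -0.5
    x = N * fill_time / gamma
    theta = 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))
    sell_time = gamma / (theta * N)
    profit = sell_time / (fill_time + sell_time) ** 2
    share = sell_time / (fill_time + sell_time)
    return {"theta": theta, "profit": profit, "share": share}


def innovator_profit_ceiling(delta: float, theta: float, N: float, gamma: float) -> float:
    """Upper bound on the innovator's profits: zero costs, barter rate of one."""
    return (1.0 - delta) * theta * N / gamma


def crude_delta_bound(L: float, N: float, gamma: float, beta: float) -> float:
    """Smallest delta at which the ceiling no longer exceeds incumbent profits."""
    return 1.0 - incumbent_values(L, N, gamma, beta)["share"] ** 2


L, N, gamma, beta = 1000.0, 100.0, 5.0, 0.5   # hypothetical values
vals = incumbent_values(L, N, gamma, beta)
bound = crude_delta_bound(L, N, gamma, beta)
print("crude bound on delta:", bound)
print("ceiling at the bound:", innovator_profit_ceiling(bound, vals["theta"], N, gamma))
print("incumbent profit:    ", vals["profit"])
```

Under these assumptions the sales share of total time equals θ** (by the equilibrium condition for the selling share), so the bound simplifies to 1 − θ**². The bound is far from tight, since the ceiling ignores all of the innovator's production time and labor costs.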


¹ One discriminatory equilibrium was analyzed. Clearly, many discriminatory
equilibria are possible. If, as posited, avoidance depends on departure from the status quo use
of the different sexes, there are four symmetric discriminatory equilibria. The roles of
men and women can be symmetrically reversed, as can the roles of jobs 1 and 2.
Furthermore, there is no reason why equilibria must of necessity be a corner solution,
as in the example, with all women and type 2 men in type 2 jobs. There are whole
continua of possible equilibria in which there is avoidance of those who depart from the
status quo.
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A model of discrimination has been presented that is robust; even if 
there is a significant minority without a taste for discrimination and 
even if there is capital transfer among entrepreneurs with different 
tastes for discrimination, no entrant can profit by violating the
discriminatory custom. The key innovation in this model of discrimination
is that markets are in some sense smaller than the Walrasian
market. All traders have a chance of trading with one another. And at 
the time of trade there is no other equally satisfactory alternative 
trading partner. This assumption corresponds to empirical sociological
studies that similarly find markets to be small.

This paper considers the question of how optimal assignments depend
on the endowed traits of economic agents. Its most useful point
is that the absolute levels of these traits, in conjunction with
technology, always determine the structure of efficient allocations. Only
under very special circumstances do summary measures of the
structure of traits, such as comparative advantage, succeed in performing
this function unless they do so tautologically.


Assignment problems form an important class of questions in a wide
range of subdisciplines of economic science. In labor economics the
assignment of workers to tasks within the firm, or family members
within a household, is studied; international trade theory has as one of
its principal concerns the assignment of countries to commodities;
operations research analyses problems such as the assignment of
limited capacity to product lines, and so forth.

The Principle of Comparative Advantage has become something of
an article of faith concerning the solutions to many of these
assignment problems. While there does not appear to be any unambiguous
definition, assignment by comparative advantage is usually taken to
mean assignment on the basis of a summary measure of endowed
traits, particularly relative characteristics, whether they are workers’
skills, national factor endowments, or other attributes. In a simple
two-attribute case, an assignment by comparative advantage can be
defined as an assignment on the basis of the ratio of the two attributes
possessed by the individual workers, country, and so on.

It is clear that this was the original intention of Ricardo, who is
widely credited with first establishing the Principle of Comparative
Advantage. Ricardo argued that assignment of countries to the
production of goods on the basis of relative labor productivities was
optimal quite independent of absolute productivities. The
Heckscher-Ohlin model of trade continued this tradition by showing that
countries should be assigned to (though not necessarily fully
specialized in) goods using intensively the country’s relatively abundant
factors. Once again, the optimal assignment depends only on relative
attributes, in this case factor endowments.

More recently, the Principle of Comparative Advantage has been
formally established in labor economics. Rosen (1978) analyzed the
optimal assignment of heterogeneous workers. For a particular type
of technology, he obtains the following very strong result. Suppose
there are just two activities ($\alpha$ and $\beta$) in which workers might engage
and many types of workers. Then, in an optimal assignment, each
worker participates in only one activity. Further, when worker types
are ordered according to declining comparative advantage at activity
$\alpha$ relative to activity $\beta$, there is a marginal worker type such that all
workers with comparative advantage greater than that of this
marginal worker type perform only activity $\alpha$, the rest performing $\beta$.
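
The cutoff rule just described can be sketched in a few lines. This is only an illustration of ranking by a ratio of endowed skills, not Rosen's model: the `WorkerType` class, the skill numbers, and the exogenously supplied number of workers sent to activity α are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerType:
    name: str
    t_alpha: float   # amount of activity alpha the worker can perform per period
    t_beta: float    # amount of activity beta the worker can perform per period

    @property
    def comparative_advantage(self) -> float:
        """Relative skill at alpha; absolute skill levels cancel out."""
        return self.t_alpha / self.t_beta


def assign_by_comparative_advantage(workers, n_alpha: int) -> dict:
    """Rank workers by declining comparative advantage at alpha and send the
    top n_alpha to alpha, the rest to beta (the cutoff is taken as given)."""
    ranked = sorted(workers, key=lambda w: w.comparative_advantage, reverse=True)
    return {w.name: ("alpha" if i < n_alpha else "beta") for i, w in enumerate(ranked)}


workers = [
    WorkerType("A", t_alpha=10.0, t_beta=10.0),  # best in absolute terms at alpha
    WorkerType("B", t_alpha=2.0, t_beta=1.0),    # weak at both, relatively better at alpha
    WorkerType("C", t_alpha=8.0, t_beta=16.0),
]
print(assign_by_comparative_advantage(workers, n_alpha=1))
```

In this made-up example worker B is assigned to α even though worker A performs five times as much of it per period, because only the ratio of skills enters the ranking.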

Although these results suggest a strong case, assignment by
comparative advantage does not square well with experience. It is not
persuasive that the employee with the highest comparative advantage
in management should become president. Indeed it is plausible that
the presidency assignment will have something to do with absolute
advantage; alternatively, a person with poor management skills will
not be chosen even if he is relatively worse at every other task in the
firm. Further, in academic economics, generally poor economists will
not be chosen as department chairmen even if they have a
comparative advantage in these activities relative to research and teaching.

International trade economists have recently been coming to grips
with these issues. A prediction that follows from the Ricardian and
Heckscher-Ohlin comparative-advantage models is that identical
economies will not trade. Yet, that countries with apparently similar
technologies and factor endowments seem to trade large volumes of
manufactured goods with one another can be taken as evidence that
assignments do not depend entirely on comparative advantage.
Indeed, trade arising from scale economies in models with identical
countries is now referred to as “non-comparative-advantage trade”
(Melvin 1969; Krugman 1979). That specialization may be optimal
independent of differences in comparative advantage is a special case
of a more general failure of the comparative-advantage principle to
predict assignments.

The purpose of this paper is to construct a model with a fairly
general technology and examine the structure of optimal assignment
based upon it. The assignment of workers within a firm is taken to be
the motivating example, with applications to other assignment
problems considered toward the end of the paper. In the firm model,
heterogeneous workers must divide their time (they need not
specialize) between two tasks, $\alpha$ and $\beta$. Among other things, it is
shown that only in a razor’s-edge case can comparative advantage in
terms of workers’ relative talents in performing the tasks required for
production predict the structure of the optimal assignment. In
contrast, absolute skill levels, along with technology, determine this
structure. It is in this sense that absolute advantage is rehabilitated by this
analysis.

The essential features of the technology used to reach this
conclusion are that: (i) workers of a given skill level may experience
decreasing or increasing marginal product with respect to the amount of one
task performed; and (ii) one task may be a public input such as
research or administration within the firm. Decreasing marginal
product arises for the usual variable proportions reasons, a concept that
could be interpreted as covering boredom and fatigue. Increasing
marginal product corresponds to a warmup notion or indivisibilities
within the task itself. The existence of a public or joint input is a
justification for the existence of the firm in the first place.

The paper is structured as follows. Section I describes the labor
force and workers’ endowed traits, and the maximization problem
faced by firms makes up Section II. Optimal assignments are
characterized in Section III; extensions to other settings follow in Section
IV. The final section summarizes and concludes.

I. Skills and the Labor Force 

Each worker in the labor force is endowed with the ability to perform
two tasks, called $\alpha$ and $\beta$, where only one task may be undertaken at
any point in time. There are many different types of workers,
differentiated by the quantity of tasks $\alpha$ and $\beta$ that each can execute in one
period of work; $t = (t_\alpha, t_\beta)$ is the vector of such attributes. Thus a
worker of type $t$ can perform $t_\alpha$ of task $\alpha$, or $t_\beta$ of task $\beta$, or some
combination depending upon the allocation of time over tasks.

The supply side of the labor market is completely described by a
continuous density $L(\cdot)$, having support $\Omega \subset \mathbb{R}^2_+$. This specification
assumes that the period of work is predetermined and that workers
regard both tasks as equally distasteful.

II. The Firm’s Problem 

In this section the maximization problem faced by firms is set out. 
Initially, attention is focused on the technology available to firms. 
Subsequently it is shown that the optimal assignment problem can be 
studied as a particularly straightforward first stage in a two-stage 
maximization problem. 

Firms are assumed to be perfect competitors in both product and
factor markets, selling output $X$ at price $p$. All have access to the same
technology, in which labor is the only variable factor of production.
The number of workers of type $t$ employed by the firm is described by
a measurable function $l(t)$, positive for $t \in \omega$, where $\omega \subset \Omega$ is chosen by
the firm. The wage paid to workers of type $t$ is $w(t)$, where $w(\cdot)$ is
positive, monotonically increasing in both arguments, and continuous.
The wage depends only on a worker's skills (and specifically not
on his assignment) by virtue of the assumption that both tasks are
equally distasteful.

The technology available to the firm delineates the relationship
among output, the number of workers of each type, the skills of each,
and the manner in which the workers' time is allocated between tasks.
Output ($X$) is a monotonically increasing and strictly concave function
of the aggregate ($Q$) of individual outputs, $q(\cdot)$:

$$X = x(Q), \qquad x' > 0, \quad x'' < 0. \tag{1}$$

$x(\cdot)$ is concave as a result of the existence of fixed factors, ignored in
what follows. $Q$ is assumed to be a linear aggregate of individual
outputs $q$:

$$Q = \int_\omega q[\varphi(t),\, t_\alpha,\, \xi(t)T_\beta]\, l(t)\, dt. \tag{2}$$

The individual output of a single worker of type $t$, $q(t)$, requires (i.e.,
all inputs are necessary) inputs of time spent on task $\alpha$, $\varphi(\cdot)$, skill at
task $\alpha$ ($t_\alpha$), and an allocation, $\xi(\cdot)T_\beta$, of the aggregate quantity of task $\beta$
performed ($T_\beta$).¹ The function $\xi(\cdot)$ describes the allocation of $T_\beta$
across workers, to be explained shortly. Note from (2) that in general
one worker’s time spent performing task $\alpha$ does not substitute
perfectly for that of another in the production of final output. Imperfect
substitutability occurs because at the individual level the marginal
product of time spent performing task $\alpha$ is not necessarily constant.
Indeed, the notion that the marginal product of time spent performing
task $\alpha$ may be diminishing for the usual variable proportions
reasons (a rising ratio of $\varphi$ to $t_\alpha$ and $\xi(\cdot)T_\beta$) or increasing as a result of
indivisibilities is key to the discussion of specialization, or the lack
thereof, contained in the sections to follow.

¹ It is not necessarily assumed that $q(\cdot)$ is concave. The specific assumptions made
concerning $q(\cdot)$ are spelled out below as they are required.

The total quantity of task $\beta$ is given by

$$T_\beta = \int_\omega [1 - \varphi(t)]\, t_\beta\, l(t)\, dt. \tag{3}$$

Equation (3) implies that there is perfect substitutability (in the
production of aggregate task $\beta$) between workers’ time spent performing
task $\beta$. Perfect substitutability is not crucial for the general flavor of
the results presented below, but this specification permits an
especially simple characterization of the optimal assignment.

The production of shirts provides an illustration of the ideas
behind the structure (1)-(3). Task $\alpha$ is the actual sewing of cloth into
shirts, and task $\beta$ the production of cloth, as well as administration or
pattern design. Each worker is allocated a portion of the latter
collection, which he combines with his time and talent at sewing, to
construct shirts ($q$). Shirts are aggregated to obtain the total production of
shirts ($Q$), which are then transported and distributed using fixed
inputs (not modeled) to arrive at “marketed shirts,” $X(Q)$.
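The structure (1)-(3) can be sketched numerically for a firm with finitely many worker types, replacing the integrals by sums. This is only an illustration: the functional forms `q_example` and `x_example`, the helper names, and the numbers are assumptions that do not appear in the text, and the shared input is treated as fully public (every worker uses all of $T_\beta$).

```python
import math

def total_task_beta(phi, t_beta, l):
    """Discrete analogue of (3): T_beta = sum_i (1 - phi_i) * t_beta_i * l_i."""
    return sum((1.0 - p) * tb * n for p, tb, n in zip(phi, t_beta, l))

def aggregate_output(q, phi, t_alpha, t_beta, l):
    """Discrete analogue of (2) with a fully public input:
    Q = sum_i q(phi_i, t_alpha_i, T_beta) * l_i."""
    T = total_task_beta(phi, t_beta, l)
    return sum(q(p, ta, T) * n for p, ta, n in zip(phi, t_alpha, l))

def q_example(phi, t_alpha, T_beta):
    # Hypothetical individual technology: all inputs are necessary,
    # so output is zero whenever any input is zero.
    return phi * t_alpha * T_beta

def x_example(Q):
    # Hypothetical final-output function with x' > 0 and x'' < 0 for Q > 0.
    return math.sqrt(Q)

# Two worker types, one worker of each: the first only sews (task alpha),
# the second only produces cloth (task beta).
phi = [1.0, 0.0]
t_alpha = [2.0, 1.0]
t_beta = [1.0, 3.0]
l = [1.0, 1.0]

T_beta = total_task_beta(phi, t_beta, l)
Q = aggregate_output(q_example, phi, t_alpha, t_beta, l)
X = x_example(Q)
```

In this toy firm all of $T_\beta$ comes from the second worker, and all shirts are sewn by the first.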

The function $\xi(t)$ delineates the manner in which $T_\beta$ is shared out
among workers: $\xi(t) \ge 0$ satisfies

$$\int_\omega \xi(t)\, l(t)\, dt = f(\{l\}), \tag{4}$$

where $f(\{l\})$ is a positive, real-valued functional defined on measurable
functions $l(\cdot)$ representing the firm’s labor force. The basic notion is
that the way in which $T_\beta$ may be shared out can depend on both the
size and composition of the firm's work force. A number of special
cases are of interest. $f(\cdot) \equiv 1$ implies that $T_\beta$ is a purely private input, of
which each worker of type $t$ receives a share $\xi(t)$. For example, task $\beta$
could be the cutting of material in the shirt factory. A special case of
this purely private input setup is that in which $\xi(t)T_\beta = [1 - \varphi(t)]t_\beta$:
that is, each worker does all his own cutting, implying individual
production functions are of the form $q[\varphi, t_\alpha, (1 - \varphi)t_\beta]$.

In contrast, $f(\cdot) = \int l\, dt$ allows $T_\beta$ to be treated as a pure public input
($\xi \equiv 1$), examples of which are a design for a shirt and administrative
activities providing a work environment within which task $\alpha$ is
undertaken. More generally, for work forces $l^1$ and $l^2$, distinct but of equal
size (i.e., $\int l^1\, dt = \int l^2\, dt$), $f(\{l^1\}) \ne f(\{l^2\})$ embodies the idea that the
private/public nature of the shared input $T_\beta$ depends on the
composition of the work force. For example, $f(\{l^1\}) > f(\{l^2\})$ can be interpreted
as $l^1$ requiring less supervision than $l^2$. In any case, so long as $f(\cdot) > 1$
there is an element of jointness, providing a reason for some
collection of workers to work together.

The qualitative characteristics of the assignments studied below do
not depend in any important way on the extent to which $T_\beta$ is public
or private, or whether $\xi(\cdot)$ is a matter of choice given its public/private
characteristics. Since much algebra is avoided in the process, $T_\beta$ will be
treated as a pure public good: $\xi(\cdot) \equiv 1$.²

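Given the technology available to the firm, the profit maximization
problem is

$$\max_{\varphi(t),\, l(t)} \; p\,x\!\left( \int_\omega q[\varphi(t), t_\alpha, T_\beta]\, l(t)\, dt \right) - \int_\omega w(t)\, l(t)\, dt - F$$

subject to

$$\forall\, t \in \omega, \quad 0 \le \varphi(t) \le 1 \ \text{ and } \ l(t) \ge 0, \tag{i}$$

$$T_\beta = \int_\omega [1 - \varphi(t)]\, t_\beta\, l(t)\, dt, \tag{ii}$$

where $\omega = \{t \mid l(t) > 0\}$ is the support of $l(t)$ and $F$ is the cost of fixed
factors.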

This problem is greatly simplified by noting the following. Recall
that under the assumption that individuals find all tasks equally
distasteful, $w(\cdot)$ does not depend on $\varphi(\cdot)$. Given this result, the full
profit-maximizing problem can be viewed as choosing $\varphi(\cdot)$ to maximize $Q$ for
fixed $l(\cdot)$ (the “assignment problem,” yielding a maximal $Q$ given $l(\cdot)$
denoted by $\Psi(\{l(t)\})$, the “derived” production functional), then
choosing $l(\cdot)$ with $\varphi(\cdot)$ varying optimally (the standard factor
proportions problem).³

The assignment problem can be studied in isolation provided the
assumed $l(\cdot)$ is of the form that will emerge in equilibrium. The
analysis to follow assumes that $\omega$ does not consist of isolated points (i.e.,
firms do not specialize in hiring a comparatively small number of
types of workers). This assumption is consistent with equilibrium
outcomes provided $\Psi(\cdot)$ is a strictly concave functional. In this case, factor
demands are obtained from $\Psi(\cdot)$ in the usual fashion, and market
equilibrium can be constructed as follows. Aggregation over firms
yields total factor demand $D[\{w(t)\}]$. Supply is $L(\cdot)$, discussed above.
Assuming the optimal firm size is not too large, free entry (recall $F > 0$)
will yield an equilibrium in which wages are such that firms choose
$l(\cdot) = L(\cdot)/N$, where $N$ is the equilibrium number of firms, and $\omega = \Omega$.
Thus, assuming concavity of $\Psi$, the assignment problem can be
studied in isolation for arbitrary $l(\cdot)$ that are continuous on $\Omega$.⁴

² Rosen employed the technology $X = x(T_\alpha, T_\beta)$, where $T_\beta$ is as above, and
$T_\alpha = \int_\omega \varphi(t)\, t_\alpha\, l(t)\, dt$. In terms of (1)-(4), this specification can be written
$q(\varphi, t_\alpha, T_\beta) = \varphi(t)\, t_\alpha f(T_\beta)$, whence $Q = T_\alpha f(T_\beta)$. For comparison with what
follows, note the linearity of $q(\cdot)$ in $t_\alpha$.

³ When maximizing over more than one variable, it is obviously always possible to do
so sequentially, solving for the optimal value of a subset of variables conditional on the
others. The useful fact here is that under the assumptions made, the first-stage
problem is to maximize $Q$.

Proceeding in this manner, the individual firm’s assignment
subproblem is

$$\max_{\varphi(t) \in [0,1]} \; \int_\Omega q\!\left[ \varphi(t),\, t_\alpha,\, \int_\Omega [1 - \varphi(v)]\, v_\beta\, l(v)\, dv \right] l(t)\, dt, \tag{5}$$

where $v = (v_\alpha, v_\beta)$ is a vector integration dummy ranging over $\Omega$.
First-order necessary conditions for a maximum are (letting $\varphi^*[t]$
denote the optimal values)

$$q_1(\cdot) - t_\beta\,\mu \;\begin{cases} \ge 0 & \text{if } \varphi^*(t) = 1, \\ = 0 & \text{if } 0 < \varphi^*(t) < 1, \\ \le 0 & \text{if } \varphi^*(t) = 0, \end{cases}$$

where $\mu \equiv \partial Q/\partial T_\beta = \int q_3(\cdot)\, l(t)\, dt > 0$; $\mu$ is the efficiency price of a unit of
task $\beta$; and $q_1$ and $t_\beta\mu$ are the marginal products (in terms of $Q$) of
time spent on tasks $\alpha$ and $\beta$, respectively. Note that $\mu$ does not vary
across worker types, which is the simplification obtained by assuming
$T_\beta$ to be a pure public good.
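As a check on the first-order condition, consider a firm employing a single worker type under an assumed Cobb-Douglas individual technology $q = \varphi^a t_\alpha^b T_\beta^c$. The functional form, the parameter values, and the function names below are illustrative assumptions, not part of the model in the text. In that case $Q$ is proportional to $\varphi^a(1-\varphi)^c$, the interior optimum is $\varphi^* = a/(a+c)$, and the residual $q_1 - t_\beta\mu$ should vanish there.

```python
A, B, C = 0.5, 0.6, 0.25  # assumed exponents on phi, t_alpha, T_beta

def foc_residual(phi, t_alpha, t_beta, l):
    """q_1 - t_beta * mu for a firm with one worker type of mass l."""
    T = (1.0 - phi) * t_beta * l                        # public input, as in (3)
    q1 = A * phi ** (A - 1) * t_alpha ** B * T ** C     # marginal product of time on alpha
    q3 = C * phi ** A * t_alpha ** B * T ** (C - 1)     # marginal product of the public input
    mu = q3 * l                                         # mu = dQ/dT_beta
    return q1 - t_beta * mu

def aggregate_output(phi, t_alpha, t_beta, l):
    """Q for the single-type firm under the assumed technology."""
    T = (1.0 - phi) * t_beta * l
    return phi ** A * t_alpha ** B * T ** C * l

phi_star = A / (A + C)
residual = foc_residual(phi_star, t_alpha=2.0, t_beta=1.5, l=10.0)

# A coarse grid search over phi gives an independent check that phi_star maximizes Q.
grid = [i / 1000 for i in range(1, 1000)]
phi_grid = max(grid, key=lambda p: aggregate_output(p, 2.0, 1.5, 10.0))
```

The grid maximizer should sit within one grid step of `phi_star`, and `residual` should be zero up to rounding.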


⁴ Failure of $\Psi(\cdot)$ to be concave in $l(\cdot)$ may occur for two reasons. One is that individual
outputs are aggregated linearly to obtain $Q$ (i.e., they are perfect substitutes). While this
assumption simplifies the analysis considerably, it raises the possibility that it may pay
for firms to specialize in hiring workers of a particular type; the pattern of assignment,
which is the focus of the analysis, may degenerate. At the cost of generating a good deal
of algebra, it is possible to obtain results akin to those presented below, when $Q$ is
assumed to aggregate individual output in a constant elasticity of substitution (CES)
fashion. In the interest of simplicity, the perfect substitution assumption will be
maintained. The second source of nonconcavity of $\Psi(\cdot)$ is the public input assumption. While
public inputs provide a nice rationale for the existence of the firm, they create a
problem in that some offsetting factor must be imposed to ensure that the firm's
optimal employment is finite. To see just what is required, consider the employment
decision. The marginal return to hiring workers of type $t$ is $px' \cdot \{q + [1 - \varphi(t)]t_\beta\mu\}$,
where $\mu \equiv \partial Q/\partial T_\beta = \int q_3\, l\, dt > 0$. Hiring one more worker of type $t$ generates
individual output of $q$ plus (for $0 \le \varphi[t] < 1$) more public input, yielding $[1 - \varphi(t)]t_\beta\mu$
output from other workers. The term $px'$ converts marginal aggregate individual
output into marginal revenue. Each additional worker of type $t$ costs $w(t)$. Accordingly,
assuming it pays to hire some workers of type $t$, a necessary condition for $l(t)$ to be finite
is that marginal returns fall at a rate bounded away from zero. Equivalently,

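$$px''\,\{q + [1 - \varphi(t)]t_\beta\mu\}^2 + px'\,[1 - \varphi(t)]t_\beta \left\{ 2q_3 + [1 - \varphi(t)]t_\beta \int q_{33}\, l\, dt \right\} < \delta$$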

for some $\delta < 0$. The arguments familiar from the theory of the firm literature
essentially congest the positive externalities responsible for firm formation through the use
of fixed factors. Here this type of effect arises through $x'' < 0$ (which obtains as a result
of the rising ratio of $Q$ to “other factors” in production of final output) and $q_{33} < 0$,
occurring because of an increasing ratio of private to public inputs in individual
production.

Optimal time allocations $\varphi^*(\cdot)$ can take on numerous forms. The
two of primary interest are those for which (i) $0 < \varphi^*(t) < 1$ for each
$t$ (all workers are diversified); and (ii) $\varphi^*(t) = 0$ or $1$ for each $t$ (all
workers are specialized). A condition necessary for pure diversification
is that for all worker types $t$,

$$S_t \equiv q_{11}(\cdot) - 2q_{13}(\cdot)\, t_\beta\, l(t) + [t_\beta]^2\, l(t) \int q_{33}\, l(v)\, dv < 0, \tag{6}$$

where the left-hand side is evaluated for $\varphi^*(t)$ satisfying (5) as an
equality. Equation (6) could fail because of increasing marginal
product of either $\varphi$ or $T_\beta$ in individual production, in which case some
degree of specialization is implied.


III. Optimal Assignments 

In this section, the solution to the optimal assignment problem is
characterized. For the sake of brevity, attention is confined to the
purely diversified and specialized cases, which are pursued in that
order. The manner in which time allocation, individual output, and
wage rates vary across worker types is readily established. Following
that, whether optimal assignments can be characterized in terms of
the pattern of comparative advantage is examined. In order for this
idea to have any content, there must be clear definitions of
comparative advantage, an assignment, and what it means for comparative
advantage to dictate that assignment. The definitions employed here
are as follows: Relative to a worker of type $t^1$, a worker of type $t^0$ has a
comparative advantage at task $\alpha$ if and only if $t^0_\alpha/t^0_\beta > t^1_\alpha/t^1_\beta$. Assignment
refers to time allocated to task $\alpha$, $\varphi(\cdot)$. Finally, an assignment $\varphi(\cdot)$ is
said to follow the pattern of comparative advantage if and only if,
whenever two worker types are compared, the type having a
comparative advantage at task $\alpha$ is optimally allocated more time
performing task $\alpha$.

Given these definitions, it is demonstrated that, except for a razor’s-edge
case, for each worker type $t$ there are worker types having
comparative advantage (disadvantage) at task $\alpha$ who are assigned less
(more) time performing that task.⁵

This is a strong result, and the reader may wish to quarrel with the
definition of comparative advantage and assignment on which it is
based. Yet the arguments applied below appear to be effective
whenever comparative advantage and assignment are not defined so
as to be identical. In brief, only under the rarest of circumstances is it
possible to predict the pattern of assignment given only summary
information on the structure of skills. Some relevant information
about skills and their interaction with technology is almost invariably
lost.⁶

⁵ This result holds for $t \in \operatorname{int} \Omega$; $t \in \operatorname{Bd} \Omega$ requires an obvious minor modification.


Optimal Diversification 

An optimal diversified time allocation $\varphi^*(t)$ solves

$$q_1[\varphi^*(t), t_\alpha, T_\beta] - t_\beta\,\mu = 0 \quad \forall\, t \in \Omega. \tag{7}$$

$\varphi^*(t)$ simply equates the marginal value of time in each activity.

The manner in which $\varphi^*(t)$ varies across individuals is obtained
from differentiation of (7) with respect to elements of $t$:

$$\frac{\partial \varphi^*}{\partial t_\alpha} = -\frac{q_{12}}{q_{11}} \gtrless 0 \tag{8}$$

and

$$\frac{\partial \varphi^*}{\partial t_\beta} = \frac{\mu}{q_{11}} < 0. \tag{9}$$

Note that since the experiment involves comparing individual workers
with different abilities, as opposed to changing the skills of a given
worker, $\mu$ and $T_\beta$ remain constant. Also, though not strictly required
by the second-order conditions, $q_{11} < 0$ is assumed.

Workers having greater facility in the performance of task $\alpha$ will
spend more or less time performing that task according to whether
greater ability raises or lowers the marginal product of time; $q_{12} > 0$
will be taken to be the leading case. But those who are more able at
task $\beta$ always spend less time performing task $\alpha$. This occurs because
all workers utilize the same quantity of task $\beta$ in individual production.
Raising $t_\beta$ thus merely increases the opportunity cost of time
spent on task $\alpha$.
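The comparative statics in (8) and (9) can be checked numerically under an assumed Cobb-Douglas technology $q = \varphi^a t_\alpha^b T_\beta^c$, for which (7) has a closed-form interior solution. The functional form, the parameter values, and the function names are illustrative assumptions; $\mu$ and $T_\beta$ are held fixed, as in the comparison across workers described above.

```python
A, B, C = 0.5, 0.6, 0.3   # assumed exponents on phi, t_alpha, T_beta
T_BETA, MU = 1.0, 1.0     # held fixed when comparing workers within a firm

def phi_star(t_alpha, t_beta):
    """Interior solution of (7) for q = phi**A * t_alpha**B * T_BETA**C."""
    return (A * t_alpha ** B * T_BETA ** C / (t_beta * MU)) ** (1.0 / (1.0 - A))

def q11(phi, t_alpha):
    """Second derivative of q with respect to phi."""
    return A * (A - 1) * phi ** (A - 2) * t_alpha ** B * T_BETA ** C

def q12(phi, t_alpha):
    """Cross derivative of q with respect to phi and t_alpha."""
    return A * B * phi ** (A - 1) * t_alpha ** (B - 1) * T_BETA ** C

ta, tb, h = 1.0, 1.0, 1e-6
phi0 = phi_star(ta, tb)

# Central finite differences of the optimal allocation.
dphi_dta_numeric = (phi_star(ta + h, tb) - phi_star(ta - h, tb)) / (2 * h)
dphi_dtb_numeric = (phi_star(ta, tb + h) - phi_star(ta, tb - h)) / (2 * h)

# Right-hand sides of (8) and (9).
dphi_dta_formula = -q12(phi0, ta) / q11(phi0, ta)
dphi_dtb_formula = MU / q11(phi0, ta)
```

With these assumed numbers $q_{12} > 0$, so greater ability at task $\alpha$ raises $\varphi^*$ while greater ability at task $\beta$ lowers it.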

Equations (8) and (9) permit construction of $\varphi$-constant loci in $\Omega$.
One such locus is depicted in figure 1 and labeled $\varphi$. In the figure, the
arrow emanating from $\varphi$ represents the direction in which loci
corresponding to greater $\varphi^*(t)$ lie. The slope of the $\varphi$-constant locus is

$$\left. \frac{dt_\beta}{dt_\alpha} \right|_{d\varphi^* = 0} = \frac{q_{12}}{\mu}.$$

⁶ It is important to emphasize that the notion of comparative advantage determining
the structure of assignment is only useful when it is not tautological, and that it certainly
can be made tautological. For example, in the following subsection a set of skills on
which $\varphi^*(t)$ is constant is identified. Requiring, as seems minimal, that comparative
advantage be defined in the space of worker characteristics, it is possible to define
comparative advantage by stating that all worker types whose skills lie in the $\varphi$-constant
set have identical comparative advantage. Alternatively, given the definition of
comparative advantage in terms of $t_\alpha/t_\beta$, it is in general possible to find some function of $q$, $\varphi$,
and $t_\alpha$ that is constant as $t_\alpha$ and $t_\beta$ vary if and only if $t_\alpha/t_\beta$ is fixed, and then define
assignment to be measured by that function. Below it is shown that $\varphi(\cdot)$ is an example of
such a function under an appropriate technological restriction. Both these approaches
deny comparative advantage any operational content.

The pattern of individual output across workers may also be obtained.
Since individual output is $q[\varphi^*(t), t_\alpha, T_\beta]$ and $T_\beta$ does not vary
across workers,

$$\frac{\partial q}{\partial t_\alpha} = q_1 \frac{\partial \varphi^*}{\partial t_\alpha} + q_2 \gtrless 0 \tag{10}$$

and

$$\frac{\partial q}{\partial t_\beta} = q_1 \frac{\partial \varphi^*}{\partial t_\beta} < 0. \tag{11}$$

Workers who are more able at task $\alpha$ will produce more output simply
because $t_\alpha$ is an input and less to the extent that they might be
allocated less time (if $q_{12} < 0$). Using (8), it can be shown that $\partial q/\partial t_\alpha < 0$
if and only if $(\partial/\partial\varphi)(-q_1/q_2) < 0$, which is equivalent to $t_\alpha$ being an
inferior factor of production when $T_\beta$ is held constant.⁷ Thus $\partial q/\partial t_\alpha > 0$
is to be expected. On the other hand, since $t_\beta$ does not enter individual
production directly and increments to $t_\beta$ reduce $\varphi^*$, workers who
are more able at performing task $\beta$ always produce less output.

In figure 1 $q$-constant loci are labeled $q$. Using (8)-(11) it may be
checked that for any $t$, the $q$-constant locus through that point is
always steeper than the corresponding $\varphi$-constant locus:

$$\left. \frac{dt_\beta}{dt_\alpha} \right|_{dq = 0} = \frac{q_{12}}{\mu} - \frac{q_2}{q_1\,(\partial \varphi^*/\partial t_\beta)} > \left. \frac{dt_\beta}{dt_\alpha} \right|_{d\varphi^* = 0}.$$

The argument is straightforward. An increment to $t_\alpha$ requires some
adjustment (of the same sign as $q_{12}$) to $t_\beta$ if $\varphi$ is to be held fixed. But
given this adjustment, $q$ must be above its initial value because $t_\alpha$ is
greater and both $\varphi$ and $T_\beta$ are unchanged. Thus $t_\beta$ must be raised still
further to reduce $\varphi$ sufficiently to return $q$ to its original value.

The impact of changes in $t$ on wage rates is obtained as follows: The
value of the marginal product of workers of type $t$ is
$px'(\cdot)\{q(\cdot) + [1 - \varphi^*(t)]t_\beta\mu\}$, which will also be their equilibrium wage rate $w(t)$. Using
the envelope theorem, worker characteristics are valued at rates

$$\frac{\partial w}{\partial t_\alpha} = px'(\cdot)\, q_2$$

and

$$\frac{\partial w}{\partial t_\beta} = px'(\cdot)\,[1 - \varphi^*(t)]\,\mu.$$

⁷ Using (8),

$$\frac{\partial q}{\partial t_\alpha} = \frac{1}{q_{11}}\,(q_2 q_{11} - q_1 q_{12}) < 0 \iff q_2 q_{11} - q_1 q_{12} > 0 \iff \frac{\partial}{\partial \varphi}\left( -\frac{q_1}{q_2} \right) < 0.$$

The last condition is that when $T_\beta$ and $t_\alpha$ are held fixed, an increase in $\varphi$ raises the
marginal rate of substitution of $\varphi$ for $t_\alpha$, which is equivalent to inferiority of $t_\alpha$.

Of particular interest are comparisons of workers whose skills differ
only by scale. For any worker of type $t$, workers whose skills are the
same up to scale have abilities $\zeta t$, where $\zeta$ is a scalar equal to unity for
the “base” worker. In figure 1, such workers are arranged along the
ray from the origin; workers of type $t$ are the base.

Differentiation of (7) gives

$$-\frac{q_{11}}{q_1}\,\frac{d\varphi^*}{d\zeta} = \eta - 1, \tag{12}$$

where $\eta \equiv t_\alpha q_{12}/q_1$ is the elasticity of the marginal product of $\varphi$ with
respect to $t_\alpha$. Raising $\zeta$ increases the value of time in both activities.
Whether $\eta$ exceeds or falls short of unity determines whether the
marginal value of time spent performing task $\alpha$ rises faster or slower
than the value of time on task $\beta$. Though $\eta > 1$ can hold even for
concave $q(\cdot)$, and $q_{12} < 0$ implies $\eta < 0$, $0 < \eta \le 1$ is evidently the
leading case. Thus workers who are proportionately more able will
generally spend less time performing task $\alpha$: $d\varphi^*/d\zeta < 0$.
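The scale experiment can be illustrated under the same kind of assumed Cobb-Douglas technology, $q = \varphi^a t_\alpha^b T_\beta^c$, for which $\eta = b$. Differentiating (7) along the ray $\zeta t$ and evaluating at $\zeta = 1$ gives $d\varphi^*/d\zeta = -(q_1/q_{11})(\eta - 1)$, which is negative when $\eta < 1$. The functional form, parameter values, and names below are assumptions made for the sketch.

```python
A, B, C = 0.5, 0.6, 0.3   # assumed exponents; B plays the role of eta
T_BETA, MU = 1.0, 1.0     # common to all workers in the firm

def phi_star(t_alpha, t_beta):
    """Interior solution of (7) for q = phi**A * t_alpha**B * T_BETA**C."""
    return (A * t_alpha ** B * T_BETA ** C / (t_beta * MU)) ** (1.0 / (1.0 - A))

def eta(phi, t_alpha):
    """Elasticity of q_1 with respect to t_alpha (equals B for this technology)."""
    q1 = A * phi ** (A - 1) * t_alpha ** B * T_BETA ** C
    q12 = A * B * phi ** (A - 1) * t_alpha ** (B - 1) * T_BETA ** C
    return t_alpha * q12 / q1

base = (1.0, 1.0)
scales = [1.0, 1.5, 2.0]
allocations = [phi_star(z * base[0], z * base[1]) for z in scales]

# Finite-difference derivative of phi* along the ray, evaluated at zeta = 1.
h = 1e-6
dphi_dzeta = (phi_star((1 + h) * base[0], (1 + h) * base[1])
              - phi_star((1 - h) * base[0], (1 - h) * base[1])) / (2 * h)

phi0 = phi_star(*base)
q11_over_q1 = (A - 1) / phi0          # q_11 / q_1 for this technology
implied = -q11_over_q1 * dphi_dzeta   # should equal eta - 1
```

Because $b < 1$ here, the proportionately more able workers in `allocations` spend less time on task $\alpha$.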

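Whether workers who are proportionately more able produce
more output is less clear:

$$\frac{dq}{d\zeta} = q_1 \frac{d\varphi^*}{d\zeta} + q_2 t_\alpha \gtrless 0,$$

though, of course, such workers must earn more:

$$\frac{dw}{d\zeta} = px'\{q_2 t_\alpha + [1 - \varphi^*(t)]\mu t_\beta\} > 0.$$

Further, whether earnings rise proportionately more or less than the
increase in skills depends on whether $q(\cdot)$ is concave or convex in $t_\alpha$:

$$\left. \frac{1}{w}\frac{dw}{d\zeta} \right|_{\zeta = 1} = \frac{q_2 t_\alpha + [1 - \varphi^*(t)]\mu t_\beta}{q + [1 - \varphi^*(t)]\mu t_\beta} \gtrless 1 \iff q_2 t_\alpha \gtrless q.$$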
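The earnings result can be checked numerically. The sketch below assumes a Cobb-Douglas technology $q = \varphi^a t_\alpha^b T_\beta^c$ with $b < 1$ (so that $q_2 t_\alpha = bq < q$) and normalizes $px'$ to one; the functional form, the normalization, and all names are assumptions. It compares a finite-difference elasticity of the wage with respect to the scale factor against the ratio $\{q_2 t_\alpha + [1 - \varphi^*]\mu t_\beta\}/\{q + [1 - \varphi^*]\mu t_\beta\}$.

```python
A, B, C = 0.5, 0.6, 0.3   # assumed exponents on phi, t_alpha, T_beta
T_BETA, MU = 1.0, 1.0     # common to all workers in the firm
PX_PRIME = 1.0            # p * x'(Q), normalized to one for the sketch

def phi_star(t_alpha, t_beta):
    """Interior solution of (7) for the assumed technology."""
    return (A * t_alpha ** B * T_BETA ** C / (t_beta * MU)) ** (1.0 / (1.0 - A))

def q(phi, t_alpha):
    """Individual output under the assumed technology."""
    return phi ** A * t_alpha ** B * T_BETA ** C

def wage(t_alpha, t_beta):
    """Value of marginal product: own output plus value of public input supplied."""
    phi = phi_star(t_alpha, t_beta)
    return PX_PRIME * (q(phi, t_alpha) + (1.0 - phi) * t_beta * MU)

ta, tb, h = 1.0, 1.0, 1e-6
phi0 = phi_star(ta, tb)
w0 = wage(ta, tb)

# Elasticity of the wage with respect to the scale factor, at zeta = 1.
elasticity_numeric = (wage((1 + h) * ta, (1 + h) * tb)
                      - wage((1 - h) * ta, (1 - h) * tb)) / (2 * h) / w0

q2_ta = B * q(phi0, ta)   # q_2 * t_alpha for this technology
public_part = (1.0 - phi0) * MU * tb
elasticity_formula = (q2_ta + public_part) / (q(phi0, ta) + public_part)
```

With $q$ concave in $t_\alpha$, the elasticity is below one: earnings rise less than proportionately with skills.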

Now, does comparative advantage determine the pattern of
assignment? Recall that an assignment follows the pattern of comparative
advantage if and only if workers who have a comparative advantage at
task $\alpha$ are allocated more time performing that task.

A condition necessary and sufficient for $\varphi^*(t)$ to follow the pattern
of comparative advantage is $\eta \equiv 1$ when $q(\cdot)$ is evaluated at
$[\varphi^*(t), t_\alpha, T_\beta]$. If and only if this condition holds do all workers having a given
$t_\alpha/t_\beta$ allocate their time the same way (eq. [12]). Moreover, higher $t_\alpha/t_\beta$
yields increased $\varphi^*(t)$ ($\eta \equiv 1$ implies $q_{12} > 0$ in [8]).

The sufficiency part of the argument is obvious from (12). Necessity
provides a little extra information. If $\varphi^*(t)$ does not vary with
proportional changes in $t$, $\eta \equiv 1$ is implied, which can be regarded as
a differential equation in $q_1(\cdot)$:⁸

$$t_\alpha\, q_{12}(\varphi^*, t_\alpha, T_\beta) - q_1(\varphi^*, t_\alpha, T_\beta) = 0.$$

The solution is of the form

$$q_1(\varphi, t_\alpha, T_\beta) = q^0(\varphi, T_\beta)\, t_\alpha$$

for some function $q^0(\cdot)$. Integration over $\varphi$ recovers the technology

$$q(\varphi, t_\alpha, T_\beta) = q^1(\varphi, T_\beta)\, t_\alpha + q^2(t_\alpha, T_\beta)$$

for some $q^1(\cdot)$ and $q^2(\cdot)$. That all factors are necessary implies $q^2(\cdot) \equiv 0$.
For this technology (7) can be rewritten

$$q^0(\varphi^*, T_\beta) = \mu\, \frac{t_\beta}{t_\alpha},$$

in which case the dependence of $\varphi$ on $t_\beta/t_\alpha$ only is immediate and
arises because marginal returns and costs are proportional to $t_\alpha$ and $t_\beta$,
respectively.
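A worked special case may help fix ideas. Assume, purely for illustration, the Cobb-Douglas form $q = \varphi^a t_\alpha^b T_\beta^c$ with $0 < a < 1$ (a functional form that is not used in the text). The elasticity $\eta$ and the interior solution of (7) are then:

```latex
\begin{align*}
q_1 &= a\,\varphi^{a-1} t_\alpha^{b} T_\beta^{c}, \qquad
q_{12} = a b\,\varphi^{a-1} t_\alpha^{b-1} T_\beta^{c}, \qquad
\eta = \frac{t_\alpha q_{12}}{q_1} = b, \\
\varphi^* &= \left( \frac{a\, T_\beta^{c}}{\mu} \cdot \frac{t_\alpha^{b}}{t_\beta} \right)^{1/(1-a)} .
\end{align*}
```

Under this assumption $\varphi^*$ depends on skills only through $t_\alpha/t_\beta$ exactly when $b = 1$, which is the linear-in-$t_\alpha$ case; for $b \ne 1$ a proportional change in $t$ shifts $\varphi^*$.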

The lack of correspondence between optimal assignment and
assignment by comparative advantage can be restated as follows: For
worker type $t$ and associated time allocation $\varphi^*(t)$, if $\eta \ne 1$ there exist
both (i) workers who have a comparative advantage at task $\alpha$ and
spend less time performing that task and (ii) workers who have a
comparative disadvantage at task $\alpha$ and spend more time performing
it. Referring to figure 2, drawn for $0 < \eta < 1$, relative to workers of
type $t^0$, worker types in the northeast (southwest) shaded area have a
comparative advantage (disadvantage) at task $\alpha$ but spend less (more)
time performing that task. The cases $\eta \le 0$ and $\eta > 1$ can be treated
in a similar fashion.
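A concrete instance of this restatement can be constructed under an assumed Cobb-Douglas technology $q = \varphi^a t_\alpha^b T_\beta^c$ with $0 < b < 1$, so that $0 < \eta < 1$. The functional form, the parameter values, and the three worker types are hypothetical choices made for the sketch.

```python
A, B, C = 0.5, 0.6, 0.3   # assumed exponents; eta = B lies strictly between 0 and 1
T_BETA, MU = 1.0, 1.0     # common to all workers in the firm

def phi_star(t_alpha, t_beta):
    """Interior solution of (7) for q = phi**A * t_alpha**B * T_BETA**C."""
    return (A * t_alpha ** B * T_BETA ** C / (t_beta * MU)) ** (1.0 / (1.0 - A))

def ratio(t):
    """Comparative advantage at task alpha, measured by t_alpha / t_beta."""
    return t[0] / t[1]

base = (1.0, 1.0)
more_able = (4.0, 3.0)    # higher t_alpha/t_beta than the base, and much larger in scale
less_able = (0.25, 0.3)   # lower t_alpha/t_beta than the base, and much smaller in scale

phi_base = phi_star(*base)
phi_more = phi_star(*more_able)
phi_less = phi_star(*less_able)

# (i) comparative advantage at alpha, yet less time on alpha than the base type
case_i = ratio(more_able) > ratio(base) and phi_more < phi_base
# (ii) comparative disadvantage at alpha, yet more time on alpha than the base type
case_ii = ratio(less_able) < ratio(base) and phi_less > phi_base
```

Both cases arise because the two comparison types differ from the base type far more in scale than in the ratio $t_\alpha/t_\beta$.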

Since assignment by comparative advantage holds if and only if
$q(\varphi, t_\alpha, T_\beta) = q^1(\varphi, T_\beta)\,t_\alpha$, it is worth inquiring into whether this is a strong
restriction. First of all, for any $q(\varphi, t_\alpha, T_\beta)$ is it possible to redefine units
of $t_\alpha$ so that $q(\cdot)$ is linear in the new units of $t_\alpha$? That is, does there exist
a function $\hat{t}(t_\alpha)$ with $\hat{t}' > 0$, such that $q(\varphi, t_\alpha, T_\beta) = \hat{q}(\varphi, T_\beta)\,\hat{t}(t_\alpha)$? If this
were so, appropriate choice of units would always yield comparative
advantage determining assignment. Blackorby, Primont, and Russell
(1978) show that such a change of units is possible if and only if $q(\cdot)$
has the property $\partial \log q/\partial \varphi = \tilde{q}(\varphi, T_\beta)$ for some $\tilde{q}(\cdot)$ not depending on
$t_\alpha$; equivalently, $q = \hat{q}(\varphi, T_\beta)\,h(t_\alpha)$. The function $q(\cdot)$ must be
multiplicatively separable. That multiplicative separability is a strong restriction
is easily established. Briefly, let $S$ be the set of production functions
$q(\cdot)$ and $S^M$ be the subset of $S$ that is multiplicatively separable. Then it
is easy to check that the complement of $S^M$ in $S$ (i.e., the set of
production functions that are not multiplicatively separable) is open and
dense in $S$. Roughly speaking, almost everything in $S$ is not
multiplicatively separable.⁹

All this is not to say that comparative advantage is completely
irrelevant. Indeed, it is obvious that it is always possible to find two worker
types whose optimal assignments agree with assignment by
comparative advantage. In fact, recalling (8) and (9), for $q_{12} > 0$ changes in
comparative advantage induced purely by increments to just one of $t_\alpha$
or $t_\beta$ generate adjustments that are always in accord with comparative
advantage. This result obtains simply because changes in one of $t_\alpha$ or
$t_\beta$ induce relatively large changes in $t_\alpha/t_\beta$. The difficulty arises when
both $t_\alpha$ and $t_\beta$ vary.

⁸ In this equation, $t_\beta$ varies with $t_\alpha$ to hold $t_\alpha/t_\beta$ fixed, but $T_\beta$ is constant across workers,
as is $\varphi$ (by hypothesis).

⁹ That the complement of $S^M$ is dense in $S$ is obvious. Openness is shown by checking
that $S^M$ is closed (under sup norm) in $S$.

To see this point more clearly, write the equality of marginal
products of time across activities as

$$\frac{q_1}{t_\alpha} \cdot \frac{t_\alpha}{t_\beta} = \mu. \tag{13}$$

If $q_1/t_\alpha$ is held fixed, an increase in $t_\alpha/t_\beta$ raises the left-hand side of (13).
This effect, which may be called the comparative advantage effect
(CA), always works in favor of increasing $\varphi$. But for given $t_\alpha/t_\beta$, an
increase in $t_\alpha$ either raises, leaves constant, or lowers the factor $q_1/t_\alpha$
depending on whether $\eta$ exceeds, equals, or falls short of unity. This
effect, which can be labeled the absolute advantage effect (AA), can
thus imply either greater or smaller $\varphi$ depending on the size of $\eta$
relative to unity and the direction in which $t_\alpha$ changes. Therefore,
whether a given comparison of two worker types generates differences
in assignments in accord with assignment by comparative advantage
depends on the relative size of the CA and AA effects. For
example, when worker types $t^0$ and $t^1$ are compared in figure 2,
optimal assignments correspond to assignments by comparative advantage.
However, for workers of type $t^2$ (having the same comparative
advantage as $t^1$ but more able), relative to the comparison of types $t^0$
and $t^1$, the comparison of $t^0$ to $t^2$ has an identical CA effect but a much
stronger AA effect owing to the larger change in $t_\alpha$. Consequently the
AA effect dominates and the comparison of types $t^0$ and $t^2$ yields
disagreement with assignment by comparative advantage. Loosely
speaking, this larger AA effect is the reason the shaded region fans
out to the right and left. That is, the $\varphi$-constant curve is the locus of
vectors $t$ for which the AA and CA effects exactly cancel. When the
AA effect is larger, a more substantial CA effect is required to offset
it. Overall, comparative advantage will tend to be a poor proxy for
optimal assignment when differences in absolute advantage are large
relative to differences in comparative advantage.¹⁰


¹⁰ The specific argument utilized to prove the result on the relationship between
comparative advantage and optimal assignment hinges on the manner in which those
terms are defined. But a more general form of the argument indicates that apart from
the tautological methods mentioned in n. 6, construction of measures of comparative
advantage and assignment for which comparative advantage determines (or completely
summarizes) assignment will be unlikely to meet with success. The general argument is
as follows. Let $C = c(t)$ be a measure of comparative advantage ($c[t] \equiv t_\alpha/t_\beta$ above)
having level sets $\mathcal{C}(C) = \{t \in \Omega \mid c(t) = C\}$. Similarly, let $A = a(t)$ be an index of
assignment ($a[t] \equiv \varphi^*[t]$ above) with level sets $\mathcal{A}(A) = \{t \in \Omega \mid a(t) = A\}$; assume at least
one element of each of $\nabla c$ and $\nabla a$ is always nonzero. Then comparative advantage
completely determines assignment if and only if for all $(C, A)$ such that
$\mathcal{C}(C) \cap \mathcal{A}(A) \ne \emptyset$, $\mathcal{C}(C) = \mathcal{A}(A)$. Put this way it is immediate that the restriction on $a(t)$ required to
achieve correspondence with $c(t)$ is $a(t) = f[c(t)]$, where $f'$ has one sign. In the analysis
above $\varphi^*$ is a monotonically increasing function of $t_\alpha/t_\beta$ when $q(\cdot) = q^1(\cdot)\,t_\alpha$. More
generally, some very strong restriction on $q(\cdot)$ will be required provided, as is minimal
for $a(t)$ to make any sense at all, the dependence of $a(t)$ on $t$ operates at least in part
through $q(\cdot)$ and/or $\varphi^*(t)$. For example, retaining $c(t) = t_\alpha/t_\beta$ and letting (as is illustrative
but not very sensible) $a(t) = q[\varphi^*(t), t_\alpha, T_\beta]$ implies a differential equation for $q(\cdot)$ having
solution $q = [\ldots]$, where $1/2 < a_1 < 1$, $a_2 = a_1/(2a_1 - 1) > 1$, and $h' > 0$.

Optimal Specialization

When the optimal assignment involves pure specialization, the
situation is similar to the diversified case, but a good deal more
straightforward. The discussion is therefore brief.

For all workers of type $t$, either $\varphi^*(t) = 1$ or $\varphi^*(t) = 0$, depending
on the sign of $q(1, t_\alpha, T_\beta) - t_\beta\mu$, where $\mu$, taking into account that $\varphi^*(t) \ne 0$
implies $\varphi^*(t) = 1$, is as above and $q(\cdot)$ is the individual output of a
worker of type $t$ who spends all of his time performing task $\alpha$. The
second term is the value of the gain in the additional output of other
workers occasioned when the type $t$ worker performs only task $\beta$,
generating the public input.

Worker types $t$ for which the firm is indifferent about whether they
perform task $\alpha$ or $\beta$ are those for which

$$q(1, t_\alpha, T_\beta) - t_\beta\,\mu = 0. \tag{14}$$

The set of such worker types is labeled $\varphi$ in figure 3. The slope of $\varphi$ is
$q_2/\mu > 0$. The curvature of $\varphi$ depends only on $q_{22}$. Figure 3 is drawn
for $q_{22} < 0$.

Since $q_2 > 0$ and $\mu > 0$, all workers to the south of $\varphi$ perform only
task $\alpha$, the rest being assigned to $\beta$.

Turning to individual output, only those assigned to task $\alpha$ produce
positive output. Among that group, those who are more able at task $\alpha$
produce more, while ability at task $\beta$ has no influence on output
because it has no influence on time allocation. The $q$-constant loci are
vertical lines in $\omega$, with loci corresponding to greater output lying to
the right. Wages are simply the marginal value of either own output
or task $\beta$: $px'q$ or $px't_\beta\mu$.

As above, the most interesting issue is whether time allocation can
be predicted solely from the pattern of comparative advantage in
terms of workers’ skills. To determine whether this is the case, again
parameterize skills by $\zeta$ and differentiate $q - \mu\zeta t_\beta$, taking $\zeta = 1$ to be a
worker type such that $q - \mu t_\beta = 0$ ($t^0$ in fig. 3 is such a type). Then

$$\frac{d}{d\zeta}\,(q - \mu\zeta t_\beta) = q_2 t_\alpha - \mu t_\beta = q_2 t_\alpha - q = q\left( \frac{t_\alpha q_2}{q} - 1 \right) \lesseqgtr 0,$$

depending on whether $q$ is concave, linear, or convex in $t_\alpha$. That is, $\varphi$ is
a ray through the origin in figure 3 if and only if

$$q(1, t_\alpha, T_\beta) = q^1(1, T_\beta)\, t_\alpha. \tag{15}$$

In this case, workers having a comparative advantage at task $\alpha$
perform at least as much of that task as do the base group.¹¹ Assignment
by comparative advantage and optimal assignment never conflict.
Note that (15) is identical (for $\varphi = 1$) to that which was both necessary
and sufficient for the corresponding result in the previous subsection.
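The specialization rule behind (14) and the role of linearity in (15) can be illustrated with an assumed technology $q(1, t_\alpha, T_\beta) = t_\alpha^b T_\beta^c$: a worker performs only task $\alpha$ when $q(1, t_\alpha, T_\beta) - t_\beta\mu > 0$. The functional form, the parameter values, and the worker types are hypothetical choices made for the sketch.

```python
T_BETA, MU, C = 1.0, 1.0, 0.3   # assumed public input, its efficiency price, and exponent

def performs_alpha(t_alpha, t_beta, b):
    """Sign test behind (14), assuming q(1, t_alpha, T_beta) = t_alpha**b * T_BETA**C."""
    return t_alpha ** b * T_BETA ** C - t_beta * MU > 0

def ratio(t):
    """Comparative advantage at task alpha, measured by t_alpha / t_beta."""
    return t[0] / t[1]

base = (1.0, 1.0)        # lies on the indifference locus for any b: q - t_beta*mu = 0
more_able = (4.0, 3.0)   # comparative advantage at alpha relative to the base
less_able = (0.25, 0.3)  # comparative advantage at beta relative to the base

# Concave case (b < 1): assignment can run against comparative advantage.
concave_more = performs_alpha(*more_able, b=0.6)
concave_less = performs_alpha(*less_able, b=0.6)

# Linear case (b = 1): the indifference locus is a ray and the conflict disappears.
linear_more = performs_alpha(*more_able, b=1.0)
linear_less = performs_alpha(*less_able, b=1.0)
```

In the concave case the more able type has a comparative advantage at task $\alpha$ but is assigned to task $\beta$, and the less able type is assigned to task $\alpha$; in the linear case both assignments follow comparative advantage.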

When $q(\cdot)$ is not linear in $t_\alpha$, the obvious results are that: (1) for all
worker types performing task $\alpha$ and possessing comparative advantage
satisfying $\underline{(t_\alpha/t_\beta)} < t_\alpha/t_\beta < \overline{(t_\alpha/t_\beta)}$ in figure 3, there are other worker types that
have a comparative advantage at task $\alpha$ but perform only task $\beta$; and
(2) for all worker types performing only task $\beta$ and possessing
comparative advantage satisfying $\underline{(t_\alpha/t_\beta)} < t_\alpha/t_\beta < \overline{(t_\alpha/t_\beta)}$, there are workers
who have a comparative advantage at task $\beta$ but who perform only
task $\alpha$.


IV. Discussion 

While the results above are derived for an arguably special situation,
the same principles hold in a wide range of circumstances. The
purpose of this section is to examine a few diverse cases and mention two
related points.

First, the discussion of market equilibrium was conducted as if the
firm in question were the only variety feasible in the economy or,
alternatively, as if only one homogeneous good were produced. Then
$w(\cdot)$ was determined within the industry. It is possible to carry out the
analysis for a small industry taking $w(\cdot)$ as exogenous and allowing
variations in product price to limit entry. The assignment results are
unchanged by this modification, but in general $l(t) > 0$ only on $\omega \subset \Omega$
(strictly). The set of worker types that will be attracted to the industry
can be determined from examination of the set of workers who are
indifferent between locations: those $t$ satisfying $px'[q + (1 - \varphi^*)\mu t_\beta] = w(t)$
in the diversified case and $px' \cdot \max(q, \mu t_\beta) = w(t)$ under
specialization. In general, unless $w(\cdot)$ is restricted by more than
monotonicity, it does not appear possible to obtain any very useful results
on the shape of $\omega$ (equivalently, the nature of sorting into this
industry). Work on a two-sector model wherein $w(\cdot)$ is determined
endogenously appears more promising.

¹¹ The phrase “at least as much as” appears in the result rather than “strictly more”
because $\varphi^*$ takes on only two values in the specialized case.

A second point is related to the hedonic approach to the determination of wages. Therein, to obtain determinate job-worker matches it is necessary to assume that worker characteristics cannot be “unbundled.” In general, the implicit price of a characteristic then depends on the quantity purchased. The framework studied above is one in which the unbundling assumption is obtained from the underlying structure of production. The asymmetric treatment of tasks α and β generates w(·), which is nonlinear in $t_\alpha$ and linear in $t_\beta$. More generally, when the public input does not aggregate the $t_\beta$ linearly, w(·) will be nonlinear in $t_\beta$ as well.

The ideas in this essay have applicability well beyond the specific instance studied above. To illustrate, consider a two-person household ($t^0$ and $t^1$) whose household output (Q) is

$$Q = q[\varphi(t^0), t^0_\alpha, Y] + q[\varphi(t^1), t^1_\alpha, Y],$$

where φ(·) is time spent on household activities, $t_\alpha$ is ability to perform them, and Y is the value of purchased goods, the public (or at least partially public) good nature of which provides the rationale for family formation. Let $t_\beta$ be skills used in the market and μ their rental rate per efficiency unit in units of the purchased good. Then

$$Y = \mu\{[1 - \varphi(t^0)]t^0_\beta + [1 - \varphi(t^1)]t^1_\beta\}.$$

It follows that, focusing on the diversified case, the condition determining φ*(t) is, for $t = t^0, t^1$,

$$q_1[\varphi^*(t), t_\alpha, Y] - \rho t_\beta = 0,$$

where $\rho \equiv \mu\{q_3[\varphi^*(t^0), t^0_\alpha, Y] + q_3[\varphi^*(t^1), t^1_\alpha, Y]\}$. The analysis proceeds exactly as above.

On a different note, consider a heterogeneous firm model.12 For simplicity suppose each firm has a fixed labor force composed of identical (within the firm) workers; heterogeneous entrepreneurs and homogeneous (across firms) workers would do equally well, but the analysis is more cumbersome. Suppose task β is the production of an intermediate good used in final production, requiring task α and time. The intermediate good may be bought or sold at price μ. Aggregate skills within the ith firm are $t^i = (t^i_\alpha, t^i_\beta)$. Given the work force, profit maximization for firm i is equivalent to

$$\max\; q[\varphi^i, t^i_\alpha, B^i] - \mu I^i,$$

where $B^i = t^i_\beta(1 - \varphi^i) + I^i$ is the quantity (B) of the intermediate good used in production, consisting of own production $t_\beta(1 - \varphi)$ plus the quantity purchased, I. Here there are no public-good-type gains to cooperation. Again focusing on the diversified case, the necessary conditions for a profit maximum for firm i are $q_1 - q_3 t_\beta = 0$ and $q_3 = \mu$. Combining these yields

$$q_1[\varphi^*(t^i), t^i_\alpha, B^i] - \mu t^i_\beta = 0.$$

Again the analysis proceeds as before.


12 This setting can accommodate either free or restricted entry.

The primary attribute of these examples that allows them to operate in a manner formally identical to the problem described above is that the efficiency price of $t_\beta$ is independent of t. While convenient, this is not at all necessary for the kind of results obtained. For example, even in Robinson Crusoe-type production $q[\varphi t_\alpha, (1 - \varphi)t_\beta]$, where time and skill interact multiplicatively, time allocation that matches the cross-island pattern of comparative advantage is optimal if and only if q(·) is homothetic.

Finally, consider an international trade setting in which each of two (a and b) countries is endowed with L (normalized to unity) homogeneous workers, each having skills t. Quantities of two goods α and β, $C_\alpha$ and $C_\beta$, are produced according to $C_\alpha = q_\alpha(\varphi t_\alpha)$ and $C_\beta = q_\beta[(1 - \varphi)t_\beta]$. The proportion of the work force specialized to production of good α is φ; the rest produce β. For output prices $p = (p_\alpha, p_\beta)$, the value of output is

$$v(\varphi, t, p) = p_\alpha q_\alpha(\varphi t_\alpha) + p_\beta q_\beta[(1 - \varphi)t_\beta].$$

When $q_\alpha(\cdot)$ and $q_\beta(\cdot)$ are linear, the textbook Ricardian model is obtained. One country will specialize (a in the production of good α, b in β) if and only if $t^a_\alpha/t^a_\beta > t^b_\alpha/t^b_\beta$. Workers are assigned on the basis of comparative advantage. However, if $q_\alpha(\cdot)$ is convex and $q_\beta(\cdot)$ linear, countries may specialize even if $t^a_\alpha/t^a_\beta = t^b_\alpha/t^b_\beta$ (Melvin 1969). The same result can occur if country a has a larger labor force or one that has an absolute advantage in the performance of task α. Similar arguments imply that countries may or may not specialize and trade on the basis of relative factor endowment as in the Heckscher-Ohlin model.13
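
As a small illustration of the textbook Ricardian case, the sketch below evaluates the value of output $v = p_\alpha q_\alpha(\varphi t_\alpha) + p_\beta q_\beta[(1 - \varphi)t_\beta]$ under the assumption that both technologies are linear (here simply the identity). The function names, skill endowments, and prices are hypothetical and chosen only for illustration; with linear technologies the value is linear in φ, so each country ends up at a corner.

```python
def value_of_output(phi, t_alpha, t_beta, p_alpha, p_beta):
    """Value of output when a share phi of the work force produces good alpha.

    Assumption: linear technologies q_alpha(x) = x and q_beta(x) = x.
    """
    return p_alpha * phi * t_alpha + p_beta * (1.0 - phi) * t_beta


def optimal_share(t_alpha, t_beta, p_alpha, p_beta):
    """The objective is linear in phi, so the maximizer is a corner (1 or 0)."""
    return 1.0 if p_alpha * t_alpha > p_beta * t_beta else 0.0


# Hypothetical skill endowments (t_alpha, t_beta) and output prices.
country_a = (2.0, 1.0)   # skill ratio t_alpha / t_beta = 2
country_b = (1.0, 1.0)   # skill ratio t_alpha / t_beta = 1
prices = (1.0, 1.5)      # p_beta / p_alpha = 1.5 lies between the two skill ratios

phi_a = optimal_share(*country_a, *prices)
phi_b = optimal_share(*country_b, *prices)
print(phi_a, phi_b)
```

In this example country a, which has the higher skill ratio, puts its whole work force into good α and country b into good β, which is the comparative-advantage assignment described in the text.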


V. Summary and Conclusions 

This essay addressed the question how optimal assignments depend on the endowed traits of economic agents. This question is a very general one, which was approached here in the specific setting of assigning workers to tasks within a competitive firm. It was shown, however, that even the restricted structure imposed to analyze the firm was rich enough to be reinterpreted to yield models of time allocation within the family, heterogeneous firms, Robinson Crusoe island economies, and international trade with increasing returns. The most interesting issue in all these cases was whether it is possible to predict the pattern of allocation solely on the basis of summary information about the structure of endowed traits. The Principle of Comparative Advantage, when it holds, does exactly that.


13 Markusen (1983) provides a survey of these and related causes of “non-comparative-advantage” trade.

The model analyzed in detail (the intrafirm assignment problem) involves the determination how the members of a heterogene¬
ous labor force are assigned to various tasks when the workers in the 
labor force are not perfect substitutes. The particular technology 
utilized to generate imperfect substitution was one in which final out¬ 
put depended on aggregate individual production, where the latter 
utilizes workers’ time performing one task and the services of a public 
input. The marginal product of time was assumed to be nonconstant, 
and skill in the performance of the task operated as a parameter in 
the production function. The public input was produced using indi¬ 
vidual time and ability in the execution of a second task, in the perfor¬ 
mance of which individual workers are highly substitutable. 

The competitive equilibrium assignment of workers’ time was char¬ 
acterized for two cases—pure diversification and pure specialization. 
In each, cross-worker comparisons of time allocation, individual out¬ 
put, and wages were undertaken. The most interesting issue was sim¬ 
ply, Do workers who have a comparative advantage at some task 
optimally spend more time performing it? It was shown that except 
for a razor’s-edge case, for any given worker there were other work¬ 
ers who both had a comparative advantage at one of the tasks and 
were optimally allocated less time at the task for which they had a 
comparative advantage. In general, knowledge of both skills and tech¬ 
nology is required to predict the pattern of time allocation. 

This result held because the technological structure did not allow
all relevant information about absolute skill levels to be summarized 
by skill ratios. When this is the case, individuals with similar relative 
skills can have very different assignments, because they are very dif¬ 
ferent absolutely. Thus the most talented individual may be made 
company president regardless of whether he has a comparative ad¬ 
vantage in this activity. 

The basic message of all this is simply that only under very special circumstances are summary measures of endowed traits able to contain all the information relevant to allocation.

Short-Run Analysis of Fiscal Policy in a
Simple Perfect Foresight Model


Kenneth L. Judd

Northwestern University


This paper examines the short-run impact of current and future changes in fiscal policy on current investment in a simple representative-agent, perfect foresight model. We show that anticipated investment tax credits may depress current investment, as may an immediate income tax cut financed by future cuts in government expenditure. These impacts do result when we parameterize the model with current empirical estimates of the relevant parameters.


I. Introduction 

Many recent papers have developed models to investigate the dy¬ 
namic evolution of the economy in order to analyze dynamic effects 
of fiscal and monetary policy. Blinder and Solow (1973), Tobin and
Buiter (1976), and Turnovsky (1977) studied dynamic versions of the
Keynesian IS-LM model. The other major line of investigation has
been the analysis of perfect foresight models (e.g., Hall 1971; Brock
and Turnovsky 1981; Abel and Blanchard 1983). The major strength
of the perfect foresight framework is its foundation in standard mi¬ 
croeconomic principles and the ease of long-run analysis, whereas 
quantitative short-run analysis has been lacking in these models. 
While qualitative phase diagram analysis (e.g., as in Abel and Blanchard) is instructive, it is incapable of determining the short-run response to many intertemporally complex policy shocks of interest.

This paper develops the quantitative short-run analysis of a perfect
foresight model. In particular, I examine how an economy initially in 
a steady state responds to an unanticipated and arbitrarily complex 
change in current and future levels of taxation and spending. I show 
that short-run analysis can be accomplished with relative ease through 
the use of Laplace transforms, reducing the differential equations to 
linear algebraic equations and yielding a simple and intuitive formula 
for the short-run effects. This technical feature of the analysis is 
clearly of general interest and applicability. The major difference 
between this and most other linear models is that my coefficients are 
derived from basic parameters of taste and technology, allowing the 
examination of the quantitative significance of policy shocks and their 
sensitivity to these parameters. Also, I add a bond market, allowing 
examination of policy shocks that do not have continuous budget 
balance. 

The formulas developed below indicate the initial impact on investment, consumption, and production due to balanced-budget changes in income taxation, investment tax credit changes, and government consumption. This analysis is then applied to two issues. Suppose that a permanent cut in the income tax rate is followed, after a lag, by a future spending cut large enough to satisfy the government’s dynamic budget constraint. We find that this policy shock may initiate a phase of capital decumulation and output decline that continues until government consumption declines, after which capital accumulates until it reaches the new, higher steady-state level. This possibility is realized when the elasticity of substitution between capital and labor and the intertemporal elasticity of substitution among goods are assigned values considered representative of the U.S. economy. This is only one example of how short-run movements may differ in a quantitatively significant fashion from long-run movements, pointing out the need for tools in analyzing these short-run effects.

The second policy issue addressed is the stimulative powers of the investment tax credit. We find that while tax credits today will stimulate investment today, future tax credits may stimulate or depress investment today, depending on whether the sum of the pure rate of time preference and the rate of depreciation is less than or exceeds the positive eigenvalue of the linearized equilibrium equations. In more intuitive terms this means future tax credits depress current investment in fast-adjusting economies, while they encourage current investment in slow-adjusting economies.

The paper is organized as follows. Section II contains a description of the basic model. Section III discusses a graphic analysis of one particular fiscal policy. In Section IV, the basic short-run quantitative analysis of perfect foresight models is developed. Section V applies these results to a fiscal policy shock. Section VI summarizes the paper's main points.


II. The Model 

Assume that we have an economy of a large fixed number of identical, infinitely lived individuals. The common utility functional is assumed to be additively separable in time with a constant pure rate of time preference, ρ:

$$U = \int_0^\infty e^{-\rho t} u[C(t)]\,dt,$$

where C(t) is consumption of the single good at time t and u is the instantaneous utility function. Let $\beta(C) = -u''(C)C/u'(C)$ denote the elasticity of marginal utility, also called the coefficient of relative risk aversion. One unit of labor is supplied inelastically at all times t by each agent, for which he receives a wage of w(t). An inelastic labor supply is assumed, so that we may focus on the techniques used here to deal with the dynamic problems. The case of elastic labor supply introduces several complications and is left for a separate study (see Judd 1983).
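
To make the definition $\beta(C) = -u''(C)C/u'(C)$ concrete, the following sketch evaluates it numerically for a constant-relative-risk-aversion utility function. The CRRA form, the function names, and the numerical values are assumptions introduced here for illustration; the text itself leaves u general. Derivatives are approximated by central finite differences.

```python
import math


def crra_utility(c, beta):
    """Hypothetical CRRA utility: c**(1 - beta) / (1 - beta), or log(c) when beta == 1."""
    if beta == 1.0:
        return math.log(c)
    return c ** (1.0 - beta) / (1.0 - beta)


def elasticity_of_marginal_utility(u, c, h=1e-4):
    """Approximate -u''(c) * c / u'(c) with central finite differences of step h."""
    u_prime = (u(c + h) - u(c - h)) / (2.0 * h)
    u_double_prime = (u(c + h) - 2.0 * u(c) + u(c - h)) / (h * h)
    return -u_double_prime * c / u_prime


# Illustrative values: curvature parameter 3, consumption level 2.
beta_hat = elasticity_of_marginal_utility(lambda c: crra_utility(c, 3.0), 2.0)
print(round(beta_hat, 4))
```

For this family the elasticity is constant in C and equal to the curvature parameter, which is why a single number β can summarize intertemporal substitution in the formulas that follow.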

There are two perfectly substitutable assets in this economy, government bonds and capital stock, each with the same net riskless rate of return. Let F(k) be a standard neoclassical constant returns to scale production function giving output per capita in terms of the capital-labor ratio, k. Output can be used for consumption or investment. At t = 0, $k_0$ is the endowment of capital for each person. Capital depreciates at a constant rate of δ > 0 and f(k) shall denote the net national product, that is, gross output minus depreciation. Elasticity of substitution between capital and labor in the net production function is
denoted by σ. To allow the use of differential techniques, we assume
that u(c) and f(k) are $C^2$ functions. The value of outstanding debt in
terms of consumption is denoted by b.

We shall keep the institutional structure simple. Think of each agent owning his own firm, hiring labor, and paying himself a rental of $r_K(t)$ per unit of capital at t, gross of taxes, credits, and depreciation. It is straightforward that the alternative assumption of value-maximizing firms would be equivalent (see Brock and Turnovsky [1981] or Abel and Blanchard [1983] for formal demonstrations of this). Since there will be no discussion of policies that are sensitive to institutional structure, we can use that fact and ignore the institutional detail that firms bring. The gross return on bonds at t will be denoted $r_B(t)$.

The government will play the usual role: at time t, it taxes capital income net of depreciation at a proportional rate $\tau_K(t)$, taxes labor income at a proportional rate of $\tau_L(t)$, assesses a lump-sum tax of l(t) per capita, pays an investment tax credit on gross investment of θ(t) units of consumption per unit of investment, consumes g(t) units of output, pays interest on outstanding debt, and floats $\dot b(t)$ new bonds. The bonds are assumed to be continuously rolled over, allowing us to ignore effects due to the term structure of debt. The adjustments for consols will be noted.

This model is consistent with two types of public consumption. 
First, the public consumption can be thought of as either public goods 
that do not affect the marginal rates of substitution among private 
goods or transfers to individuals who participate in neither the capital 
nor labor market. Both interpretations could be modeled formally by 
assuming that the private utility functional is additively separable in 
private and in such public expenditure. Therefore, while there may
be value to each taxpayer from public consumption or transfers to the 
poor, the level and path of such transfers do not affect the demand 
functions of the agents for their private goods. A second class of 
public expenditures consistent with this model are publicly provided 
private goods that are perfect substitutes for private consumption. 
Being perfect substitutes, their provision is equivalent to lump-sum 
transfers to taxpayers. Therefore, our model includes both classes of 
public goods. Let g be the public spending for goods that are addi¬ 
tively separable with respect to private consumption. Lump-sum 
transfers will represent those that are perfect substitutes for private 
consumption. With this formulation we can concentrate on purely 
fiscal policy issues while allowing two major classes of public expendi¬ 
tures. 

The representative agent will choose his consumption path, C(t), capital accumulation, k(t), and bond accumulation, b(t), subject to the instantaneous budget constraint, taking the wage, rental, and tax rates as given:

$$\max_{C,\,k,\,b} \int_0^\infty e^{-\rho t}u[C(t)]\,dt$$

$$\text{s.t.}\quad C + \dot k + \dot b = w(1 - \tau_L) + [(r_K - \delta)k + r_B b](1 - \tau_K) - l + \theta(\delta k + \dot k), \tag{1}$$

$$k(0) = k_0.$$

(Time arguments are suppressed when no ambiguity results.) We define

$$q(t) = \int_t^\infty e^{\rho(t-s)}[(r_K - \delta)(1 - \tau_K) + \delta\theta]\,u'[C(s)]\,ds, \tag{2}$$

where q(t) is the current marginal utility value of an extra unit of capital at time t. Along an optimum path, each individual is indifferent between an extra 1 − θ(t) units of consumption and the extra future consumption that would result from an extra unit of investment:

$$[1 - \theta(t)]\,u'[C(t)] = q(t). \tag{3}$$

The arbitrage condition for investment in bonds is similar:

$$u'[C(t)] = \int_t^\infty e^{\rho(t-s)}\,r_B(s)[1 - \tau_K(s)]\,u'[C(s)]\,ds. \tag{4}$$

Since these equalities hold at all times t, we may conclude that

$$\rho - \frac{\dot p}{p} = r_B(1 - \tau_K) = \frac{(r_K - \delta)(1 - \tau_K) + \delta\theta - \dot\theta}{1 - \theta}, \tag{5}$$

where $p \equiv u'(C)$. In what follows, $r_B$ will be regarded as the function of $r_K$, θ, $\dot\theta$, and $\tau_K$ implied by (5).1

We assume that the transversality conditions at infinity hold for both assets in order to ensure that p, q, and k remain bounded as t → ∞:

$$(\mathrm{TC}_\infty)\qquad \lim_{t\to\infty} q(t)k(t)e^{-\rho t} = 0, \qquad \lim_{t\to\infty} p(t)b(t)e^{-\rho t} = 0. \tag{6}$$

This is a necessary condition for the agent's problem if u(·) is bounded, which is a harmless assumption here since the net production function is bounded (see Benveniste and Scheinkman 1982). In the case of bonds, the content of these conditions is most clear: the government is not allowed to play a Ponzi game with consumers; that is, it cannot succeed forever in paying off interest on old bonds by floating new bonds.

To describe equilibrium, impose the equilibrium conditions

$$r_K = F'(k), \tag{7a}$$

$$w = f(k) - kf'(k), \tag{7b}$$

$$\dot b = g + \theta(\delta k + \dot k) - \tau_K k f'(k) + b\,r_B(1 - \tau_K) - \tau_L[f(k) - kf'(k)] - l \tag{7c}$$

on (2) and the budget constraint. This yields the equilibrium equations

$$\dot q = q\left\{\rho - \frac{(1 - \tau_K)f'(k) + \delta\theta}{1 - \theta}\right\}, \tag{8a}$$

$$\dot k = f(k) - c\!\left(\frac{q}{1 - \theta}\right) - g, \tag{8b}$$

where u'[c(p)] = p defines c(p), which expresses consumption as a function of the marginal utility of consumption. These equations describe only the real activity of the economy, the path of bond holdings being determined as a residual obeying equation (7c). The transversality conditions ensure that

$$0 < \lim_{t\to\infty} q(t), \qquad \lim_{t\to\infty} k(t) < \infty. \tag{9}$$

1 Without any real loss of generality, we may assume θ to be a $C^1$ function of time. That is unnecessary if one interprets all the foregoing as generalized functions and uses the operational calculus.

The pair of equations (8) describe the equilibrium of our economy at any t such that q and k are differentiable. To determine the system’s behavior at points where q or k may not be differentiable, I impose the equilibrium conditions on (2), yielding

$$q(t) = \int_t^\infty e^{\rho(t-s)}[(1 - \tau_K)f'(k) + \delta\theta]\,u'[C(s)]\,ds, \tag{10}$$

which shows that q(t) is a continuous function of time. The system of relations given by equations (8) and (10) and the inequality (9) will describe the general equilibrium of our economy.
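
The steady state of this system can be illustrated with a short calculation. Setting $\dot q = 0$ in the equilibrium equation for q gives $\rho(1 - \theta) = (1 - \tau_K)f'(k) + \delta\theta$, which pins down the steady-state net marginal product of capital. The sketch below solves this for k under the added assumption of a Cobb-Douglas gross production function $F(k) = Ak^{a}$, so that $f'(k) = aAk^{a-1} - \delta$; that functional form, the function names, and all parameter values are hypothetical and not taken from the paper.

```python
def steady_state_net_mpk(rho, delta, tau_k, theta):
    """Net marginal product of capital implied by q-dot = 0."""
    return (rho * (1.0 - theta) - delta * theta) / (1.0 - tau_k)


def steady_state_capital(rho, delta, tau_k, theta, A=1.0, a=0.3):
    """Steady-state capital, assuming F(k) = A * k**a (an illustrative choice)."""
    target = steady_state_net_mpk(rho, delta, tau_k, theta)
    # Invert a * A * k**(a - 1) - delta = target for k.
    return (a * A / (target + delta)) ** (1.0 / (1.0 - a))


# Illustrative parameters: rho = 4%, delta = 10%, no investment tax credit.
k_high_tax = steady_state_capital(0.04, 0.10, tau_k=0.30, theta=0.0)
k_low_tax = steady_state_capital(0.04, 0.10, tau_k=0.20, theta=0.0)
print(k_high_tax, k_low_tax)
```

Under these assumptions a lower capital income tax rate lowers the required pre-tax return and raises the steady-state capital stock, which is the long-run effect against which the short-run responses studied below are compared.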

Since there are many alternative models available for studying short-run effects in perfect foresight models, some preferable on grounds of realism and/or tractability, we should note reasons for examining this one. While 2-period overlapping generations models (e.g., Diamond 1970) are good for understanding the qualitative features of perfect foresight analysis, they are far too rigid for meaningful quantitative short-run analysis. For purposes of application, a period in such a model would be on the order of 25-30 years, far longer than what would be realistically regarded as the short run. The Cass-Yaari (1967) model of continuous-time overlapping generations is not analytically tractable. Because of the inherent errors, numerical simulation of the Cass-Yaari model, as in Auerbach and Kotlikoff (1983), is limited to examination of large changes in the parameters, whereas the analytical approach used here is capable of computing marginal effects of changes in the parameters. These may be substantially different because of the nonlinearities of such models. Since legislative deliberations often concern relatively small changes, the ability to compute marginal effects is desirable. Also, this choice avoids the nonuniqueness problems that plague overlapping generations models and render comparative dynamic exercises invalid.

Although it is absurd to assume that any person has an infinite life, it is also an open question whether this is a bad approximation. The work of Kotlikoff and Summers (1981) indicates that substantial amounts of wealth are held for bequest purposes, in which case the true economic agent would consist of several generations of a family, having a life in excess of the roughly 50-year economic life span of an individual person.2 The neoclassical growth model is not used because it models savings as a function of current rate of return on capital, rendering it incapable of analyzing anticipation effects, which are very important in our analysis and for many of the arguments made by policymakers and analysts. In summary, the choice of the infinite-life version was made because (i) meaningful short-run analysis is tractable, yielding simple and intuitive formulae, where sensitivity to basic parameters is easily determined, and (ii) empirical evidence indicates that it is not an absurd approximation.

These reasons are basically ones of theoretical soundness and realism, but not of demonstrated empirical validity. Nothing defensible on that issue will be said here. However, the analysis below will still be of interest to those who reject this full-employment approach to macroeconomic analysis since this model is close in spirit to the beliefs of some policymakers. We may test their arguments for logical consistency. For example, some policymakers argue that if taxes are cut immediately to be followed later by a spending cut, the tax cut will stimulate capital formation in spite of the temporary deficit. Can they believe in their perfectly competitive model and believe that there are no substantial short-run consequences of the resulting deficit for capital accumulation and production? Let us now move to a graphic analysis of this issue in our model. This will serve to illuminate the basic features of this model, illustrate the limitations of graphic analysis, and demonstrate how short-run effects may differ from long-run effects.

III. Graphic Analysis 

One can partially analyze the impacts of policy changes on the equilibrium in a graphic fashion using phase diagrams.3 In this section I analyze the short-run consequences of an income tax cut followed with a lag by a cut in government consumption large enough to balance the government’s dynamic budget. In particular, I determine whether the deficits incurred in the short run reduce capital formation. For the purpose of this example, I assume that there is no investment tax credit and that both capital and labor incomes are taxed at the rate τ, making the graphic analysis more transparent without losing the essential points. In this section I examine the more interesting case where all government expenditure is public-goods consumption, represented by g in the equilibrium equations above.

2 The relevant open analytical question is how long the economic life of an economic agent has to be before the Cass-Yaari model is approximated well by the infinite-life model.

3 Other examples of such graphical analysis can be found in Abel and Blanchard (1983).

Equations (8) can be represented qualitatively by a phase diagram as in figure 1. Note that this phase diagram is in (c, k) space instead of (q, k) space. Since labor is inelastically supplied, this representation is equally simple and clearer. It is derived from equations (8) by means of the equality q = u'(c), which holds since there is no investment tax credit. The vertical $\dot c = 0$ curve is the locus in (c, k) space where consumption is stationary and is derived from (8a); the upwardly sloped $\dot k = 0$ line represents the locus where investment is stationary, being derived from (8b). Within each of the four regions defined by these curves, the arrows indicate the general movement of the system described by equations (8). This system displays a saddle-point structure with a stable and an unstable manifold, the former being the set of points from which the system converges to the steady state, point d. Note that a change in τ will affect only the $\dot c = 0$ locus and that changes in g affect only the $\dot k = 0$ curve.

With these tools in hand, we can analyze the effects of a tax cut followed with a lag by an expenditure cut sufficient to balance the dynamic budget of the government. This is displayed in figure 2. In a high-τ and a high-g regime, the phase diagram is described by the two stationary loci intersecting at A, the corresponding steady state, whereas the low-τ and low-g regime has steady state C. If there were no lag between cuts in τ and g, stability implies that consumption would jump vertically to that point on the stable manifold of the system with steady state C. Suppose that point is D and that the new stable manifold is the curve through D and C. From D, the economy would converge to C along DC.

Fig. 2.—Analysis of a tax cut followed by an expenditure cut

Now suppose that there is a lag between the cut in τ and the cut in g of T units of time. Then, in the time before the cut in g, the economy is governed by the AB-BC system with steady state at B: since τ is cut, the $\dot c = 0$ locus moves right, but the $\dot k = 0$ locus is unchanged since g is unchanged initially. If T is small, then continuity in T implies that the initial consumption level must be close to D, which is in the northwest sector of the AB-BC phase diagram where movement is northwesterly. Equation (10) implies that in equilibrium there are no jumps in c at t = T. Therefore, the system between t = 0 and t = T must move from somewhere on the AD line segment to a point on DC. From this we may conclude that at t = 0, the economy jumps from A to a point between A and D, say E, and that it then moves northwesterly and hits a point on the line through DC at t = T. (The initial increase in consumption appears to be less, because of the positive lag. This is not necessarily the case, because if T were greater the necessary cut in spending would also be greater, pushing the $\dot k = 0$ locus upward.) For larger T, the economy may initially jump to a point such as F, but since it must be on DC at T, the economy must go through some phase of capital decumulation prior to the spending cuts since DC passes above A. The stable manifold around C may pass below A. In such an economy, the marginal propensity to save out of the tax cut would exceed unity if the tax and spending cuts were simultaneous, a feature generally considered implausible. However, we cannot rule it out, and in this phase diagram we cannot determine which case holds for plausible production and utility functions.

This example illustrates the basic principles of the model in a transparent graphic fashion but also shows that such graphic analysis is inconclusive even in a simple case. We shall return to this example in Section V below after developing the necessary analytical tools.


IV. Quantitative Analysis 

While the graphic analysis above was instructive, it was inconclusive in determining qualitative features of the equilibrium and would always be incapable of answering questions concerning the quantitative importance of these effects. To answer such questions we must use analytical techniques. We will concentrate on analyzing a simple perturbation of a steady state, though the analysis can be easily adjusted when the initial condition is not the steady state. Suppose that the government has been taxing at constant rates $\tau_K$ and $\tau_L$, granting an investment tax credit at a constant rate θ, and consuming goods at a constant rate g, assessing a constant lump-sum tax of l, and that the economy has reached the corresponding steady state, with bonds at that level consistent with budget balance. Next suppose that at t = 0 the government has announced that at t ≥ 0, $\tau_K$ will be $\epsilon h_K(t)$ greater, $\tau_L$ will be $\epsilon h_L(t)$ greater, the lump-sum tax will be $\epsilon l(t)$ greater, the investment tax credit will be $\epsilon z(t)$ greater, and government consumption will be $\epsilon g(t)$ greater. To continue, it is necessary to make the following constancy assumption:

$h_K$, $h_L$, g, l, and z are all eventually constant functions of time.

This assumption is necessary to ensure the existence of a new steady state but is not an important limitation since the date of eventual constancy is arbitrarily distant.

For any fixed ε, equilibrium is the solution to the differential equations:

$$\dot q = q\left\{\rho - \frac{(1 - \tau_K - \epsilon h_K)f'(k) + \delta(\theta + \epsilon z)}{1 - \theta - \epsilon z}\right\}, \tag{11a}$$

$$\dot k = f(k) - c\!\left(\frac{q}{1 - \theta - \epsilon z}\right) - [g + \epsilon g(t)] \tag{11b}$$

with boundary conditions $|\lim_{t\to\infty} k(t)| < \infty$, $k(0) = k_0$. We shall denote the solutions k(t, ε) and q(t, ε), making explicit the dependence on ε. Since the economy is initially at the ε = 0 steady state, the government announcement is essentially that ε has been increased. We would like to know the impact of this change in ε on the critical variables at future times; that is, we want to know the values of

$$k_\epsilon(t) \equiv \frac{\partial k}{\partial \epsilon}(t, 0), \qquad q_\epsilon(t) \equiv \frac{\partial q}{\partial \epsilon}(t, 0).$$


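Since u(c) and f(k) are $C^2$, these derivatives exist (see Oniki 1973). Differentiation of the equilibrium system yields a linear differential equation in the variables $k_\epsilon$, $q_\epsilon$:

$$\begin{pmatrix}\dot q_\epsilon\\ \dot k_\epsilon\end{pmatrix} = \begin{pmatrix} 0 & -\dfrac{q(1 - \tau_K)f''}{1 - \theta}\\[2mm] -\dfrac{c'}{1 - \theta} & f'\end{pmatrix}\begin{pmatrix} q_\epsilon\\ k_\epsilon\end{pmatrix} + \begin{pmatrix}\dfrac{q}{1 - \theta}\,[f'h_K - (\rho + \delta)z]\\[2mm] -g - \dfrac{c'qz}{(1 - \theta)^2}\end{pmatrix}. \tag{12}$$
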
Since we are initially in a steady state, the matrix in (12) is actually a constant matrix, J, the Jacobian of the equilibrium differential equation. Therefore, the system in (12) is linear with constant coefficients and we can solve it with Laplace transforms. The Laplace transform of a function f(t) defined for positive t is another function F(s) defined for sufficiently large positive s, where $F(s) = \int_0^\infty e^{-st}f(t)\,dt$. Let $Q_\epsilon(s)$, $K_\epsilon(s)$, $H_K(s)$, Z(s), and G(s) be the Laplace transforms of $q_\epsilon(t)$, $k_\epsilon(t)$, $h_K$, z, and g, respectively. These Laplace transforms satisfy the Laplace transform of (12):

$$\begin{pmatrix} sQ_\epsilon(s)\\ sK_\epsilon(s)\end{pmatrix} = J\begin{pmatrix}Q_\epsilon(s)\\ K_\epsilon(s)\end{pmatrix} + \begin{pmatrix}\dfrac{q}{1 - \theta}\,[f'H_K(s) - (\rho + \delta)Z(s)] + q_\epsilon(0)\\[2mm] -G(s) - \dfrac{c'qZ(s)}{(1 - \theta)^2}\end{pmatrix}. \tag{13}$$



Solving for $Q_\epsilon(s)$ and $K_\epsilon(s)$ yields

$$\begin{pmatrix}Q_\epsilon(s)\\ K_\epsilon(s)\end{pmatrix} = (sI - J)^{-1}\begin{pmatrix}\dfrac{q}{1 - \theta}\,[f'H_K(s) - (\rho + \delta)Z(s)] + q_\epsilon(0)\\[2mm] -G(s) - \dfrac{c'qZ(s)}{(1 - \theta)^2}\end{pmatrix}. \tag{14}$$

We need to find the value of $q_\epsilon(0)$, the initial change in the marginal utility value of an extra unit of capital. This is tied down by invoking the stability condition. We know from stability that q(t, ε) and k(t, ε) are bounded in t for any fixed ε; we need to prove that $k_\epsilon(t, 0)$ and $q_\epsilon(t, 0)$ are also bounded. (The proof of lemma 1 is in the App.)

Lemma 1. $k_\epsilon(t, 0)$ and $q_\epsilon(t, 0)$ are bounded in t.

Let $\mu$, $\lambda$ be the eigenvalues of $J$. They are given by the formula




l ± Ji + 


4(1 - T*)M, 


3(1 - ejGTfl* 


(15a) 


where/' is the steady-state marginal product of capital, evaluated at 
the steady-state capital stock, k", both defined by 


= P( 1 ®) 80 

1 ~ T K 


(15b) 


and $\theta_c$ is the share of net output allocated to private consumption,
$c/f$. Clearly, $\mu > 0 > \lambda$ if $\tau_K, \theta < 1$. If
$(f' - \rho)/f'$, the net effective capital income tax rate, is positive,
then $\mu > f' > \rho$. This fact will play a key role in understanding the
short-run impacts of policy shocks. Lemma 1 implies that $K_\epsilon(s)$ is
bounded for all $s > 0$. In particular, $K_\epsilon(\mu)$ is bounded, implying
that the jump in the shadow value of capital at $t = 0$ is


[(P + 8 - p)Z(p) - //*(p)/'] + -^-fi(p). (16) 

q I — n c 

Combining (14) and (16), we have the solution for $K_\epsilon(s)$ and
$Q_\epsilon(s)$. Having solved for the Laplace transforms of the adjustment
paths of $q$ and $k$, we can now use them to determine the impact of the
shocks on economic variables and derive an expression for the government’s
dynamic budget constraint.
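
The saddle-point property used above (one positive and one negative eigenvalue) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the matrix entries are hypothetical placeholders, not the Jacobian $J$ of this model. The one feature they share with $J$ is a negative determinant, which for any real 2x2 matrix forces real eigenvalues of opposite sign.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Return (mu, lam), the larger and smaller eigenvalue of [[a, b], [c, d]].

    Uses the characteristic polynomial x^2 - trace*x + det = 0 and
    raises ValueError if the eigenvalues are complex.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues: not a saddle point")
    root = math.sqrt(disc)
    return (trace + root) / 2.0, (trace - root) / 2.0


# Hypothetical entries (an assumption for illustration, not the paper's J).
# The determinant is negative, so one eigenvalue is positive, one negative.
J_EXAMPLE = (0.0, -0.02, -0.3, 0.01)
mu, lam = eigenvalues_2x2(*J_EXAMPLE)
print("unstable root mu =", mu, "stable root lam =", lam)
```

Only the sign pattern carries over to the model; the magnitudes depend entirely on the placeholder entries.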

A. Impact on Consumption and Investment at t = 0 

The solutions above determine the economy’s response to a change in
$\epsilon$ in terms of the Laplace transforms of the policy changes. However,
it is possible to compute the values of $k_\epsilon$ and $q_\epsilon$ and
their time derivatives at $t = 0$ without solving for the inverse Laplace
transforms of $K_\epsilon$ and $Q_\epsilon$. The crucial fact about Laplace
transforms is

$$f(0) = \lim_{s \to \infty} sF(s), \tag{17}$$

if $F(s)$ is the Laplace transform of $f(t)$.
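
A quick numerical check of the limit in (17) may help fix ideas. The function used here is chosen only for illustration and does not come from the model: for $f(t) = a e^{-bt}$ the transform is $F(s) = a/(s + b)$, so $sF(s) = as/(s + b)$ should approach $a = f(0)$ as $s$ grows. The names `F` and `initial_value_estimates` are hypothetical helpers.

```python
def F(s, a=2.0, b=0.5):
    """Laplace transform of f(t) = a * exp(-b * t), valid for s > -b."""
    return a / (s + b)


def initial_value_estimates(a=2.0, b=0.5, s_values=(1e1, 1e3, 1e5)):
    """Evaluate s * F(s) at increasingly large s; the limit is f(0) = a."""
    return [s * F(s, a, b) for s in s_values]


estimates = initial_value_estimates()
# Distance of each estimate from the true initial value f(0) = 2.0.
errors = [abs(e - 2.0) for e in estimates]
print(estimates)
```

The errors shrink as $s$ increases, which is all that (17) asserts; no inverse transform is needed to recover the value at $t = 0$.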

Theorem 1. The initial impact on investment of the announced 
changes is 

*.(<>) = p ( ' i -"e) N0) + (p + 8 ~ + m-gm - g(0). 

(18) 

Proof. Follows directly from (14), (16), (17), and L’Hospital’s rule. 

From the formula given in theorem 1 for the impact on investment,
we can note several aspects of the relationship between fiscal policy
and capital formation. First, an increase in government expenditure
at $t = 0$, $g(0)$, causes a dollar for dollar decrease in capital formation.
In a life-cycle model such as this one, a consumer endeavors to have a
steady level of consumption; hence a momentary spurt in government
consumption of $g(0)$ at $t = 0$ will be satisfied by less capital
accumulation.

Second, the impact of future government consumption on capital
formation is expressed in the term $\mu G(\mu)$; that is, discount the change
in government spending at the rate $\mu$ and multiply the result by $\mu$. To
get some intuition for this, let us first examine a plausible but false
procedure. One might have argued that the appropriate measure of the
impact of future government consumption on investment would be
$\rho G(\rho)$: take the discounted value of the expenditures, $G(\rho)$, as
their capitalized value and note that a savings flow of $\rho G(\rho)$ would
finance the expenditures at the existing real net rate of interest. This
would be an individual’s response if interest rates were unaffected.
However, interest rates will respond to these policy changes. Equation (18)
shows that this procedure is valid for general equilibrium calculations with
the proper discount rate being $\mu$, not $\rho$. This fact points out the
importance of general equilibrium analysis versus partial equilibrium
analysis, since the positive eigenvalue is generally much larger than the
pure rate of time preference for realistic values of the crucial parameters.
Since $\mu > \rho$, $\mu G(\mu)$ puts more weight on changes in government
consumption in the near term relative to distant future changes than
$\rho G(\rho)$ does; that is, the naive partial equilibrium approach
overestimates the impact of government consumption in the distant future on
investment today and underestimates the impact of such expenditures
in the immediate future. In particular, we see that the anticipation
effects of future policy changes decay rapidly relative to the
utility discount rate as the date of the change becomes more distant.
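
The difference between the two discounting procedures can be made concrete with a small sketch. It assumes a permanent unit change in government consumption starting at date $T$, whose Laplace transform is $G(s) = e^{-sT}/s$, so that $sG(s) = e^{-sT}$. The value $\rho = 0.01$ follows the normalization adopted later in the paper; the value of $\mu$ is a hypothetical placeholder chosen only so that $\mu > \rho$.

```python
import math


def step_transform(s, T):
    """Laplace transform of a unit step that starts at date T (s > 0)."""
    return math.exp(-s * T) / s


def capitalized_weight(s, T):
    """s * G(s) for the unit step, which simplifies to exp(-s * T)."""
    return s * step_transform(s, T)


RHO = 0.01  # pure rate of time preference, normalized as in the text
MU = 0.05   # positive eigenvalue: hypothetical value, assumed for illustration

for T in (0, 4, 20, 40):
    general_eq = capitalized_weight(MU, T)    # weight under mu * G(mu)
    partial_eq = capitalized_weight(RHO, T)   # weight under rho * G(rho)
    print(T, round(general_eq, 3), round(partial_eq, 3))
```

Both weights equal one for a change that starts immediately, and the weight computed at $\mu$ falls off faster as $T$ grows, which is the sense in which the naive procedure overstates the effect of distant changes.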
One aspect of (18) may initially appear to be puzzling: an increase
in future government consumption, with current government consumption
held constant, encourages investment today. Since this term
indicates the impact on investment today with the capital income tax
rate held constant, the spending is implicitly being financed by lump-sum
taxes. Because of the bond market, the timing of these lump-sum
taxes is immaterial, but their existence is essential for the government
to remain within its budget constraint. Therefore, with income taxes
held constant, extra spending will cause $\mu G(\mu)$ to be positive, causing
investment to increase because of the consumers’ needs to finance the
extra lump-sum taxes.

Third, the impact of future and present taxation on investment
today is summed up in the first term. Again, note that the appropriate
discount rate is $\mu$, as expressed in $H_K(\mu)$. Again, since $\mu > \rho$,
the anticipation effect of future taxes on current investment is much
smaller than one may have expected. This expression has an interesting
interpretation. $[\rho/(1 - \tau)]H_K(\mu)$ is the change in revenue
discounted at $\mu$ if the capital stock does not change, expressed as a
fraction of the capital stock. Hence the change in investment is this
capitalization factor times consumption divided by the elasticity of
marginal utility, yielding a decomposition of the change in investment
into multiplicative factors representing consumption, curvature of
utility, and the value of the tax change capitalized at $\mu$. This
expression for the impact on capital formation is useful for comparative
dynamic analysis and highlights two important points. First, if $\beta$ is
large, the investment response to future tax changes is sluggish, since
high curvature in the utility function indicates a desire for an even
consumption stream and little taste for extreme changes in consumption
to finance volatile investment plans. Second, investment today
responds much more to tax changes today and in the near future than
it does to more distant tax changes.

In examining the impact of the investment tax credit changes we
see that the role of timing is more crucial, for $z(0)$, the extra tax credit
today, plays an important role, as well as $Z(\mu)$. Clearly, as $z(0)$
increases, so does investment at $t = 0$. This is expected since $z(0)$ is the
change in the initial subsidy to the initial investment. The impact of
the rest of the tax credit on current investment is ambiguous. Future
tax credit policy changes current investment by
$(\rho + \delta - \mu)Z(\mu)c/\beta(1 - \theta)$. Even if $z(t) \geq 0$, the
sign of this is ambiguous: positive for slow-adjusting economies,
$\rho + \delta > \mu$, and negative for fast-adjusting economies,
$\rho + \delta < \mu$. Fast-adjusting economies are associated with less
concave utility functions. When faced with smaller future tax credits,
such investors will invest more today to take advantage of the current
short-lived tax credits, and when the tax credits are less generous in
the future they will just as rapidly decumulate, treating today’s tax
credit as a subsidy to future consumption. For people with more
concave utility functions, such fluctuations in consumption are disliked
and future tax credits are an inducement for investment today,
since more investment today leads to more depreciation in the future,
the replacement of which is subsidized by future tax credits. This
result differs from partial equilibrium analysis (e.g., Abel 1982),
which argues that investment tax credits are generally stimulative
whether they are permanent or temporary. These analyses do not
take into account interest rate movements, ostensibly because the effects
are trivial. Assuming that there would be no interest effects is
odd in this context since investment tax credit policies are argued to
have a macroeconomically significant impact on investment. We see
that when we allow interest rate effects, the true general equilibrium
result may be different from that indicated by the partial equilibrium
analysis. Also reflected in (18) is the fact that whatever the impact of
policy changes on investment today, that impact is magnified by the
current investment tax credit.


B. Balanced Budget Condition 


Next, we compute the relationship that must exist between the 
changes in taxation and expenditure due to the government’s budget 
constraint. The differential equation governing bonds is 

!>=■£ + Zg(t) + > ri {I - T K )h + [0 + ez(OP* + k) (19) 

- It* + di K (t)]kf\k) - [t, + di,Xi)\\J(k) - k('{k)\ - 1 - l(t). ‘ 

The government's dynamic budget constraint requires the present 
value of its obligations and expenditures to equal the present value of 
its revenues, where the appropriate discount rate is the after-tax rate 
of return . 1 Differentiating that constraint with respect to €, using the 
definition of r /t . equation (20), and the fact that b is zero in the initial 
steady state, we find theorem 2 . 

Theorem 2. Budget balance implies the following constraint on the 
policy shocks: 


0 = C.( p) 



fc(Q) 

<7 


+ pZ(p) - z(0) 


b 


- r h K t (p)(f + kf) - kf'H K { p) + v.kf'Ktip) (20) 

- Ih.mf - kf) - L( p) + 8(p + 5 )K t (f>) + Z(p)8*. 


H This can be derived from the consumers’ budget constraints and their transversality 
conditions, as in Brock and Turnovsky (1981). 
where $H_L(s)$ and $L(s)$ are the Laplace transforms of $h_L$ and $l$,
respectively.

If $b$, the initial stock of bonds, is zero, (20) asserts that extra revenue
equals extra spending discounted at the rate $\rho$, the steady-state real
net return. With $b > 0$, the real rate of interest that must be paid on
bonds when they are rolled over changes, the net discounted value of
the altered interest bill per unit of existing debt being the coefficient
of $b$ in (20). With a nontrivial term structure, this term would be
different and would disappear if bonds were actually consols. In that
case, the bearer may experience a capital gain or loss at $t = 0$.

V. Example: Cut Taxes, Then Spending 

In this section we apply the quantitative techniques of Section IV to
examine the impacts of the fiscal policy shock discussed in Section III.
If taxes are cut immediately and spending cut later, the long-run
effect is clear: increased capital formation and output. However, the
short-run effects of this policy change on capital formation are not
clear because the revenue losses are not matched by cuts in government
expenditure. The resulting deficit must be financed by government
bonds. Of course, in the long run the government’s budget must
be balanced, or more specifically, that must be the expectation if
investors are to be willing to hold bonds today. That balancing can be
accomplished by reducing government consumption, $g$, or decreasing
lump-sum transfers to those who participate in the economy. To the
extent that the budget will be balanced by reductions in transfers to
workers and investors, the analysis is straightforward from the
foregoing graphic analysis and equation (18): only the $\dot{c} = 0$ locus
will be affected, and the economy will jump to the stable manifold
associated with the new tax rate, converging monotonically to the new
steady state where consumption, income, and the capital stock are all
greater. Therefore, in this section we will initially address the case
where the government’s budget will be balanced by reductions in
government consumption, $g$. The question we address is whether this
unanticipated change in the financing and level of future government
consumption will crowd out capital accumulation in the short run,
contrary to the long-run increase in capital.

As in Section III, we assume that the taxes on both labor and capital
incomes are equal to $\tau$ and the changes in these taxes are also identical.
This is not meant to be a precise description of the U.S. economy
or an exhaustive study of the short-run impact of this type of policy
change. Such a study would need to include an elastic labor supply
and costs of adjustment, at least. To do all this is beyond the scope and
space of one paper. Our focus here is to illustrate how the analysis
above can be applied to a particular issue and to demonstrate that
these effects are not trivial in magnitude. The analysis will also indicate
which parameters of taste and technology have a significant impact
on the answers. The results of this section turn out to be largely
unaffected by the initial level of bonds and the investment tax credit
when they are assigned reasonable values, so both are set equal to
zero.

The government decision to cut the tax rate immediately and reduce
spending at some future date $T > 0$ can be modeled above by
particular functional forms for $g$, $h_K$, and $h_L$:

$$h_K(t) = h_L(t) = -1, \tag{21a}$$

$$g(t) = \begin{cases} 0, & t < T, \\ -\bar{g}, & t \geq T, \end{cases} \tag{21b}$$

where $\bar{g}$, the magnitude of the future cut in $g$, is unknown a priori.
The value of $\bar{g}$ is determined by examining the balanced budget
condition and is found to be


y = /(A)f p7 


1 - t p - A. 3|UL 


1 _ _E- £_—i] 

l-xp-Xp-jx 


“ /(A)y, (22) 


where $\gamma$ denotes the spending cut as a proportion of net national
product.

One interesting index of this impact is the general equilibrium
marginal propensity to save, that is, the portion of the extra disposable
income at $t = 0$ saved by individuals in equilibrium, denoted by
MPS. (This is to be distinguished from the individual marginal propensity
to save out of current income.) It is equal to


MPS = -g- - ye'* 7 ' + 1. (23) 

If $T > 0$, capital accumulation begins at $t = 0$ if and only if MPS
exceeds unity, since only then are there any extra savings left after the
deficit is financed. Standard differentiation exercises for MPS are
tedious and inconclusive; furthermore, we really do not care about
derivatives at all parameter values, just at reasonable ones, and we
want some idea of the magnitudes involved. Therefore, table 1 lists
values of MPS over a wide range of values for $\beta$, $\sigma$, $T$, and
$\tau$. The value of $\rho$ is normalized to be 0.01, indicating that one
period of time is that duration over which utility is discounted by 1 percent.
To those who believe that the annual rate of discount is 4 percent, $T$ equals
the number of quarters between the tax cut and the spending cut. Casual
examination of national income accounts suggests that we take capital
share to be 0.25⁵ and government consumption to be 0.2 of net production.
These are reasonable values, especially since MPS is insensitive
to reasonable changes in these parameters compared to its sensitivity
to $\sigma$ and $\beta$.

The elasticity of substitution, $\sigma$, has been estimated often with
mixed results. We allow $\sigma$ to range between 0.4 and 1.3. This range
includes some of the low estimates from time-series analysis, the
higher cross-sectional estimates, and the reconciled estimates of
Berndt (see Berndt [1976] for a general discussion; also Nerlove
[1967]; Lucas [1969]).

The other major parameter is $\beta$. Two types of empirical analysis
can be used to guide us in choosing an appropriate range. First, we
may use the macroeconomic literature that argues for $\beta$ between 0.5
and 6 (see Weber 1970, 1975; Grossman and Shiller 1981; Hansen
and Singleton 1982, 1983). Second, the more disaggregated estimation
of demand by Phlips (1978) also (ignoring the nonsensical result
for “other services”) implies a range of $\beta$ from 0.5 to 6. We allow
$\beta$ to range between 0.2 and 10.0 in order to include at least part of the
confidence intervals.

From table 1, we may conclude several things. First, the magnitudes
of MPS indicate that the effects on savings at $t = 0$ of this policy shock
are neither negligible nor unrealistic. They also indicate that for most
values of the parameters, capital will begin to decumulate at $t = 0$ if
there is a lag between tax cuts and spending cuts. Second, MPS increases
as $T$ increases. This has an intuitive explanation: as the spending
cuts are pushed further into the future, their income effect on
today’s consumption decreases, resulting in less consumption and
more savings today. Equations (22) and (23) also show this since $\gamma$
grows at the rate $\rho$ as $T$ increases but is discounted at the rate $\mu$
in the expression for MPS. Third, as $\beta$ is less, that is, the utility
function is less concave, MPS increases. This, too, is easily explained: a
more linear utility function cares more about total consumption than about
the smoothness of the consumption path; therefore, the price effect
of the cheaper future goods dominates, depressing current consumption
and increasing savings. Fourth, as the elasticity of substitution
increases, savings out of the tax cut increases. This is because if $\sigma$ is
large, the marginal product of capital does not drop as rapidly during
the accumulation of capital, resulting in a rate of interest that declines
less rapidly. The impact of the initial tax rate is ambiguous but also
not large. Finally, note that $\mu/\rho$ is substantially larger than one.
Therefore, future tax and spending changes are discounted heavily in the
computation of the initial impact on capital formation, equation (18).

5 This implies that our $k$ excludes consumer durables, an appropriate
assumption here since their services are not taxed.

Before ending the analysis of this policy shock we should discuss
the case where the budget is eventually balanced by cutting consumption
of public goods that are perfect substitutes for private goods. It is
straightforward from (18), (22), (23), and the fact that $\mu > \rho$ that this
case is equivalent to a tax cut with either no change in $g$ or a change in
$g$ in the infinite future. For the parameter values we are examining,
400 periods is practically infinity since the positive eigenvalue
substantially exceeds the pure rate of time preference. Therefore, table 1
tells us that when $\beta$ exceeds 0.5, the MPS out of a dollar in tax cuts,
financed eventually by increases in lump-sum taxes, is at most 1.5 and
more likely about 1.2. Such balanced-budget changes in taxation and
government expenditures therefore lead to capital accumulation
immediately. However, note that the stimulus to capital formation due
to these tax cuts, about 20-50 cents per dollar of tax cuts, is generally
smaller than the capital decumulation from a dollar in tax cuts that
will be balanced by a cut in $g$, especially if the cut in $g$ is expected to
occur in the near future. Hence, if $\beta$ is not at the low end of the range,
tax cuts financed by roughly equal increases in lump-sum taxes (or
cuts in rebates), and cuts in government consumption, $g$, will depress
investment since the capital decumulation induced by the latter will
likely be the stronger influence on current investment.

VI. Conclusions 

The primary accomplishment of this paper was the development of
analytical tools for determining short-run consequences of fiscal policy
in a perfect foresight model. These tools were applied to basic
macroeconomic questions with strong results. We have seen that it is
possible that a program of tax cuts today followed later by cuts in
government consumption will initiate a period of nontrivial capital
decumulation ending only when the spending cuts are initiated. We
also found that it is unclear how future investment tax credits affect
investment today, as they are stimulative for slow-adjusting economies
and depressing to current investment in fast-adjusting economies.

The techniques used here are applicable to a wide variety of issues. 
For example, Judd (1981, 1983) uses them to analyze the excess bur¬ 
den of factor taxation in extensions of this model. 

The major conclusion that follows from this analysis is that the
long-run forces acting on an economy do matter in the short run in a
quantitatively significant fashion. While conventional macroeconomics
may be correct in arguing that other forces are important in the
short run because of various rigidities, the results here show that the
underlying long-run real forces cannot be ignored in short-run analysis.
Just as significant, the analytical determination of these effects,
taking into account the dynamic adjustment process, is a tractable
exercise.

Appendix 

Proof of Lemma 1

This is seen in two parts. Since the system is autonomous after $T$, then for
all $\epsilon$, $k(t, \epsilon)$ must be on the stable manifold for $t > T$.
Theorem 5 of Otani (1982) (which applies here since our equilibrium solves
some optimization problem; see Abel and Blanchard [1983]) shows that
$k_\epsilon(t, 0)$ is bounded for $t > T$ for any finite value of
$k_\epsilon(T, 0)$. Since $k$ and $q$ must be on the stable manifold of the
asymptotic autonomous system at $t = T$, this stable manifold is the terminal
surface when we view the problem for $t \in [0, T]$. Let
$k(t, \epsilon; q_0)$ and $q(t, \epsilon; q_0)$ denote the solutions to (9)
with $k(0, \epsilon; q_0) = k_0$ and $q(0, \epsilon; q_0) = q_0$. Then
$(\partial k/\partial q_0)(T, 0; q_0) > 0$ around the steady state under
examination because of the local saddle-point nature of the flows. However,
$k$ decreases with $q$ along any stable manifold. Hence, for small $\epsilon$,
there is a unique $q_0^\epsilon$ that causes
$[k(T, \epsilon; q_0^\epsilon), q(T, \epsilon; q_0^\epsilon)]$ to be on the
stable manifold of the system after $T$. Furthermore, due to the $C^2$ nature
of the differential system, the dependence of $q_0^\epsilon$ and $k$ on
$\epsilon$ is differentiable for all $t \in [0, T]$ (see Coddington and
Levinson 1955). Therefore,

$$k_\epsilon(T, 0) = \frac{\partial k}{\partial \epsilon}(T, 0; q_0^0) + \frac{\partial k}{\partial q_0}(T, 0; q_0^0)\,\frac{\partial q_0^\epsilon}{\partial \epsilon}$$

is finite and $k_\epsilon(t, 0)$ is uniformly bounded in $t$.


References

Abel, Andrew B. “Dynamic Effects of Permanent and Temporary Tax
Policies in a q Model of Investment.” J. Monetary Econ. 9 (May 1982):
353-73.

Abel, Andrew B., and Blanchard, Olivier J. “An Intertemporal Model of
Saving and Investment.” Econometrica 51 (May 1983): 675-92.

Auerbach, Alan J., and Kotlikoff, Laurence J. “National Savings, Economic
Welfare, and the Structure of Taxation.” In Behavioral Simulation Methods in
Tax Policy Analysis, edited by Martin Feldstein. Chicago: Univ. Chicago
Press (for N.B.E.R.), 1983.

Benveniste, Lawrence M., and Scheinkman, José A. “Duality Theory for
Dynamic Optimization Models of Economics: The Continuous Time Case.” J.
Econ. Theory 27 (June 1982): 1-19.

Berndt, Ernst R. “Reconciling Alternative Estimates of the Elasticity of
Substitution.” Rev. Econ. and Statis. 58 (February 1976): 59-68.

Blinder, Alan S., and Solow, Robert M. “Does Fiscal Policy Matter?” J. Public
Econ. 2 (November 1973): 319-37.

Brock, William A., and Turnovsky, Stephen J. “The Analysis of
Macroeconomic Policies in Perfect Foresight Equilibrium.” Internat. Econ. Rev.
22 (February 1981): 179-209.

Cass, David, and Yaari, Menahem E. “Individual Saving, Aggregate Capital
Accumulation, and Efficient Growth.” In Essays on the Theory of Optimal
Economic Growth, edited by Karl Shell. Cambridge, Mass.: MIT Press, 1967.

Coddington, E. A., and Levinson, N. Theory of Ordinary Differential Equations.
New York: McGraw-Hill, 1955.

Diamond, Peter A. “Incidence of an Interest Income Tax.” J. Econ. Theory 2
(September 1970): 211-24.

Grossman, Sanford J., and Shiller, Robert J. “The Determinants of the
Variability of Stock Market Prices.” A.E.R. Papers and Proc. 71 (May 1981):
222-27.

Hall, Robert E. “The Dynamic Effects of Fiscal Policy in an Economy with
Foresight.” Rev. Econ. Studies 38 (April 1971): 229-44.

Hansen, Lars Peter, and Singleton, Kenneth J. “Generalized Instrumental
Variables Estimation of Nonlinear Rational Expectations Models.”
Econometrica 50 (September 1982): 1269-86.

---. “Stochastic Consumption, Risk Aversion, and the Temporal Behavior
of Asset Returns.” J.P.E. 91 (April 1983): 249-65.

Judd, Kenneth L. “Dynamic Tax Theory: Exercises in Voodoo Economics.”
Mimeographed. Evanston, Ill.: Northwestern Univ., 1981.

---. “Factor Taxation in a Perfect Foresight Model.” Mimeographed.
Evanston, Ill.: Northwestern Univ., 1983.

Kotlikoff, Laurence J., and Summers, Lawrence H. “The Role of
Intergenerational Transfers in Aggregate Capital Accumulation.” J.P.E. 89
(August 1981): 706-32.

Lucas, Robert E., Jr. “Labor-Capital Substitution in U.S. Manufacturing.” In
The Taxation of Income from Capital, edited by Arnold C. Harberger and
Martin J. Bailey. Washington: Brookings Inst., 1969.

Nerlove, Marc. “Recent Empirical Studies of the CES and Related Production
Functions.” In The Theory and Empirical Analysis of Production, edited by
Murray Brown. New York: Columbia Univ. Press (for N.B.E.R.), 1967.

Oniki, Hajime. “Comparative Dynamics (Sensitivity Analysis) in Optimal
Control Theory.” J. Econ. Theory 6 (June 1973): 265-83.

Otani, Kiyoshi. “Explicit Formulae of Comparative Dynamics.” Internat. Econ.
Rev. 23 (June 1982): 411-19.

Phlips, Louis. “The Demand for Leisure and Money.” Econometrica 46
(September 1978): 1025-43.

Tobin, James, and Buiter, Willem. “Long-Run Effects of Fiscal and Monetary
Policy on Aggregate Demand.” In Monetarism, edited by Jerome L. Stein.
Amsterdam: North-Holland, 1976.

Turnovsky, Stephen J. Macroeconomic Analysis and Stabilization Policies.
Cambridge: Cambridge Univ. Press, 1977.

Weber, Warren E. “The Effect of Interest Rates on Aggregate Consumption.”
A.E.R. 60 (September 1970): 591-600.

---. “Interest Rates, Inflation and Consumer Expenditures.” A.E.R. 65
(December 1975): 843-58.



In Search of Predatory Pricing 
Can predatory pricing be reproduced in a laboratory environment?
We report research motivated by this objective. We began with conditions
that, according to the literature, are favorable to the emergence of
predation. Next we operationalized what was meant by predatory
pricing in our design in order to compare prices with predictions
from alternative theories. Of 10 experiments, none evidenced predatory
behavior; most supported the dominant firm theory. The second
series of experiments addresses remedies for predation and
finds that the effect is to increase prices and reduce efficiency.


I. Overview of Research Procedure and Results 

Is predatory pricing an observable phenomenon that can be induced
in a laboratory environment? We report research motivated by the
maintained hypothesis that if such behavior is a human trait we ought
to be able to observe it in the laboratory. Our procedure was first to
specify a set of structural conditions that appeared to us to combine
those features that were favorable to the emergence of predatory
behavior: (1) two firms, one large and one small; (2) scale economies,
with the larger firm having a cost advantage over the smaller (but with
the smaller firm’s production required for market efficiency); (3) a
“deep pocket” possessed by the advantaged firm; and (4) sunk entry
costs tending to discourage reentry when such costs must be incurred.
Next we constructed an experimental design to operationalize these
conditions and to define predatory pricing within this design. In this
design, predatory prices are distinct from several alternatives:
competitive prices, the shared monopoly price, the dominant firm price,
and Edgeworth-style price cycles.

Our first three experiments were conducted with attributes 1-3.
The second series added feature 4. After six experiments, we still had
not observed predatory pricing. A reconsideration of the literature
suggested that most predatory pricing theories implicitly assumed a
fifth feature: (5) firms have complete information about competitors’
costs. Although we do not consider complete information a realizable
field condition in most (if any) markets, we decided this condition
should be included in the search for predatory pricing behavior. Our
third series of experiments, incorporating conditions 1-5, still produced
no evidence of predatory pricing. Because some scholars have
suggested that predatory pricing, if it exists, is driven by goals other
than profit maximization, we attempted, without success, to generate
“cut-throat” pricing in one experiment by inducing rivalistic incentives.
At this point we wondered if there were something artifactual
about our experimental design that, unsuspected by us, would inhibit
the small firm’s being driven out of the market even if the large firm
posted prices and quantities below marginal cost. For example, do
subjects who are assigned the small firm’s structural conditions perceive
themselves as duty bound to remain in the market? If this were
the case, then the predicted effect of predation would not be observed,
even if we did observe predatory price levels quoted by the
large firm. So we conducted one experiment in which, unknown to
the small firm, the large firm was a confederate of the experimenters
and was instructed to price repeatedly at predatory levels. This
prompted the small firm to leave the market, and therefore we were
confident of our small firm’s vulnerability to being forced out of the
market by a determined predator.¹

The second part of the research program had the objective of examining
proposed antitrust remedies for predatory pricing that
might be imposed on an industry thought to be subject to predation.
For our antitrust treatment condition, we applied a semipermanent
price reduction rule (Baumol 1979) and a quantity expansion limit
(Williamson 1977). We conducted seven experiments (series 6) with
attributes 1-4 and with these two antitrust restrictions. Since no predation
was found in the 11 experiments based on conditions 1-4
alone, we interpret this series of seven experiments as a test for the
existence of type 2 regulatory error, that is, whether adopting
antipredatory pricing rules might induce anticompetitive incentive effects.

1 While the negative results of the 11 experiments we report cannot prove that
predatory pricing does not exist, we feel that they alter the burden of proof for those
who would design public policy as though predation were a robust phenomenon. We
invite antitrust scholars to scrutinize our experimental design, to suggest specific ways
in which they would alter it, and to state the corresponding outcomes they are prepared
to predict. We will take their suggestions seriously.

Table 1 summarizes the treatment conditions underlying the ex¬ 
periments in each of the series 1-6. 


II. Predatory Pricing: From the Literature to 
Experimental Design 

The idea that there is a distinction to be made between the price that
is low because of good competition and the price that is low because of
bad predation is well established in American legal and political history.
It appears in early Supreme Court decisions subsequent to the
enactment of the Sherman Act (e.g., Trans-Missouri Freight case and
Standard Oil case).²

Economists J. B. and J. M. Clark, in a book chapter subtitled “Destructive
Competition,” describe a process of selective price cutting
that is similar to the contemporary concept of predatory pricing, and
Senator Estes Kefauver, prominent in the development of modern
congressional antitrust policy, has mourned the passage of the era of
“independent” bakeries.³ Private antitrust cases and threats of litigation
flourish. Of course, the existence of such cases does not necessarily
demonstrate the existence of predation, since there clearly are
other incentives for firms to assert that they are victims of predation.⁴

2 U.S. v. Trans-Missouri Freight Association, 166 U.S. 290, 328 (1897). The court
suggested that monopolization may involve strategic price reductions that may drive
out of business “the small dealers and worthy men whose lives have been spent therein
and who might be unable to readjust themselves to their altered surroundings.”
Standard Oil Company of New Jersey v. U.S., 221 U.S. 1 (1911). In this case the court
implied that predation had replaced productive forms of business behavior: “The very
genius for commercial development and organization which it would seem was
manifested from the beginning soon begot an intent and purpose to exclude others which
was frequently manifested by acts and dealings wholly inconsistent with the theory that
they were made with the single conception of advancing the development of business
power by usual methods, but which on the contrary necessarily involved the intent to
drive others from the field and to exclude them from their right to trade and thus
accomplish the mastery which was the end in view.”

3 This chapter is contained in Clark and Clark, The Control of Trusts (1912). Lest
anyone confuse destructive competition with healthy price rivalry, they (p. 98) call such
practices “refined forms of robbery” and demand that “the illegitimate breaking of a
general scale of prices must, in some way, be stopped.” Sen. Kefauver states (1965, p.
139) that many independents “personally know small bakers who have been destroyed
by engaging in competitive warfare with the majors.”

4 In International Air Industries et al. v. American Excelsior Company, 517 F.2d 714
(1975), cert. denied, 424 U.S. 943 (1975), predation was alleged in the
“evaporative-cooler pad” industry. The courts rejected the claim, stating, “It would
appear that [the defendant] was selling its cooler pads at a price far above even its
average cost. Moreover, the record indicates that barriers to entry in the cooler pad
market were virtually non-existent.” This is not to indicate that the court ignored
marginal costs. They seem to be following an Areeda and Turner (1975) model in which
average cost is used in certain instances as a proxy for the more important (but less
observable) marginal cost. With regard to the issue of entry costs, the court estimated
that “the total costs of entering the market on a scale large enough to supply the
entire southwestern and far western United States” was less than $300,000.

Our task in the present research was to operationalize the concept
of predation into a reasonable economic design with testable
predictions. Our goal was to create an economic environment that we felt
would have a “best shot” at observing predatory pricing.
Unfortunately, we found no single universally accepted model of predatory
pricing. However, we were able to identify several important design
elements to use in some or in all of our experiments.

The trading environment we chose for our investigation is that of
firms producing to order a homogeneous product for sale in a
posted-offer market with full demand revelation. 5 Other design features
were identified from our reading of the literature in predatory
pricing. These are presented in the paragraphs below. Finally, in the last
seven experiments, we conducted the markets with a predatory
pricing antitrust program (PPAP), which is described in paragraph 7
below.

1. Number of firms. Every source we consulted spoke of predation by
a single predator. However, the prey may be singular (Salop 1981, p.
11) or plural (Scherer 1980, p. 335; Kreps and Wilson 1982; Milgrom
and Roberts 1982; Selten 1978). Because of our previous experience
with two-firm markets (Coursey, Isaac, and Smith 1984; Coursey,
Isaac, Luke, and Smith 1984; hereafter CIS and CILS), we decided to
continue with this design feature. In this case, however, the two firms
were not symmetric in costs.

2. Costs of the firms. The literature appears to be in disagreement
whether predator and prey are to be distinguished by costs. Some
(McGee 1958, p. 140) say no. Others (Ordover and Willig 1981, p.
308; Salop 1981, p. 19) seem to suggest that while costs may be equal,
they also may not. Still others build predation models explicitly
around the concept of a dominant firm that has some cost advantage
(Gaskins, as quoted in Scherer 1980, p. 338). Our previous
experiences (CIS, CILS) with the symmetric cost case were marked by a
complete absence of any success of one firm in achieving unchecked
monopoly power. Therefore, in order to create conditions more
favorable to obtaining predation, we chose to give the predator an
important cost advantage over the prey (hereafter called the “large”
or “small” firm), but the small firm, although perhaps disadvantaged
in our design, is efficient enough to be in production at a
Pareto-efficient competitive equilibrium (Kefauver 1965, p. 144; Ordover
and Willig 1981, p. 308).

5 The made-to-order nature of production does not allow for carryover of stock from
one period to another, and it eliminates the costs and risks of holding unsold stock.
This ensures that the induced marginal cost schedules generate replicated periods with
identical well-defined (flow) supply conditions. To our knowledge the literature
nowhere suggests that predation is related to production for inventory.

Fig. 1. Seller costs, buyer values, and supply and demand conditions

These cost conditions were obtained via the induced seller marginal
cost schedules exhibited in figure 1a. Figure 1b exhibits the market
supply and demand conditions. From figures 1a and 1b several
important attributes of our laboratory market can be noted. At the
competitive equilibrium (\(P_c \in [2.66, 2.76]\)) both firms are producing.
Seller A sells 7 units while seller B sells 3 units. Furthermore, there
exist combinations of price and quantity for seller A (\(2.60 \le P_A <
2.66\); \(8 \le Q_A \le 10\)) such that seller A can exclude seller B from the
market and yet earn a positive cash flow of returns from the
experimenters.

3. Deep pocket. Many sources describe the predator as having a capi¬ 
tal market advantage over the prey via what is popularly described as 
a deep pocket. The Wall Street Journal (1983) says of an FTC decision: 
“Critics of the agency say the new formula will let Borden price
ReaLemon below its true costs in areas where Borden faces
competition, while making up the difference in areas where ReaLemon enjoys
a monopoly.” This is almost a twin argument to a hypothetical
situation described by Clark and Clark (1912, p. 97). Scherer (1980)
quotes Edwards as saying of the predator, “the length of its purse
assures it of victory.” Scherer says directly of the predator, “it
subsidizes its predatory operations with profits from other markets.”
Salop (1981, p. 11) defines the deep pocket as the case “in which an
incumbent predator has superior access to financial resources.” The
idea is also mentioned by Kefauver (1965, pp. 146-49). We provided
a deep pocket in the following manner: Since economic losses were a 
real possibility in our design, each seller was provided an up-front 
capital endowment. However, in all cases the endowment to seller A 
(the potential predator) was double the endowment to seller B. (Also, 
firm A’s pocket was further deepened under treatment 4 below, 
which gives firm A the advantage of incumbency, as an uncontested 
monopolist, for the first 5 periods of each experiment.) 

4. Sunk cost entry and reentry barriers. A common theme in the de¬ 
scriptions of conditions favorable to predators is the requirement that 
the small firm face barriers to entry or reentry. This raises the sepa¬ 
rate but related issue of what constitutes an effective barrier to entry. 
The contemporary debate on the contestable markets hypothesis con¬ 
cerns whether economies of scale alone can fulfill this requirement. If 
economies of scale do serve as an effective barrier, then our design 
features 1, 2, and 3 above might be sufficient to provide requisite 
hurdles. However, our previous research (CIS and CILS) suggests 
that scale economies alone might not provide a sufficient barrier to 
entry. Therefore, as an additional potential entry barrier, we add a 
fourth item suggested by Ordover and Willig (1981, p. 305), namely, 
sunk cost of entry or reentry. Furthermore, we require, at the time 
the small firm is making its entry decision, that the large firm should 
have an incumbency advantage that entails some privately held 
knowledge about the nature of demand and an initially irreversible 
commitment of already having “sunk” the entry cost. (See also Salop 
1981, pp. 16-20.) 

The sunk entry cost was obtained, as in CILS, by requiring sellers to
purchase an entry permit before each was allowed to participate in
the market. Permits cost $1.00 each and were good for only 5
consecutive periods. At $1.00 each, the permit charge represents
two-thirds of the small firm’s maximum 5-period earnings at competitive
prices. To create the incumbency advantage, the experiments with
this design feature opened with seller A required to purchase two
permits good for periods 1-10. Seller B was not allowed the option of
entering until period 6. Thus, at the point when seller B had to make
an initial entry decision, seller A had irrevocably sunk enough costs to
be in the market for (at least) 5 more periods. Also at the beginning of
period 6, seller A had a 5-period advantage in obtaining private
information about market demand and in deepening his purse.

6 But see Brozen (1982, pp. 330-33) for a well-documented compilation of
arguments skeptical of the cross-subsidization, deep-pocket, entry barrier hypothesis.
Among the quotes are two from Scherer.

5. Information. Much of the literature makes no explicit reference 
to the information available to the firms, yet most appear to assume 
implicitly that firms have complete information about each other’s 
costs. (Exceptions are found in Salop [1981], Kreps and Wilson 
[1982], and Milgrom and Roberts [1982].) In most of our experi¬ 
ments, the firms did not know one another’s cost structure and 
neither knew demand. However, in three experiments we introduced 
complete cost information. In these experiments, participants had 
been in a previous predatory pricing experiment (although not with 
one another). Each was assigned the opposite of his previous position 
(so as to have sellers who knew what it was like to be on the other side), 
and each was given a written table of the other’s costs (to refresh their 
memory). 

6. Rivalry. Implicit in many of the discussions of predation is the 
issue of intent. If we cannot distinguish predatory from healthy forms 
of competition on the basis of performance variables, then intent is a 
logical direction for attempts at a distinction to take. Unfortunately, 
an intent-based standard is highly subjective. 

In one of our experiments, we introduced a treatment distinction 
between the large firm’s normal desire to exclude the smaller firm 
(based on a presumed profit-maximizing calculus) and a desire to 
exclude based on a rivalistic, abnormal intent. When this rivalistic 
feature is in effect, the large firm is told privately that it will receive a 
$1.00 cash bonus for each period in which the smaller firm chooses 
not to enter the market. In effect we attempt to induce a direct utility 
to A for excluding B, which is motivated by the conjecture that preda¬ 
tion may occur but not spring from a profit-maximizing intent. 

7. Predatory pricing antitrust program (PPAP). Our final seven exper¬ 
iments were conducted with this PPAP in place. This was operational¬ 
ized in two parts. First, the incumbent firm faced an output expansion 
limit. Whenever the smaller firm entered the market (i.e., seller B 
bought a permit in period t when he or she did not have a permit in 
period t - 1), seller A could not expand his or her maximum quantity 
offered for sale for 2 periods. Second, the incumbent faced a 
semipermanent price reduction regulation. During any of the periods 
in which seller B could be in the market, all of seller A’s price reduc¬ 
tions (if they occurred) had to be maintained for at least 5 consecutive 
periods. 
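
The two PPAP restrictions can be sketched as a small compliance check. This is an illustration only and not the program used in the experiments: the class name, the method names, and the convention that both the 2-period quantity freeze and the 5-period price ceiling start in the period that triggers them are assumptions. Prices are in cents.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PPAPMonitor:
    """Hypothetical tracker of the two PPAP restrictions on seller A."""
    freeze_periods: int = 2   # periods without output expansion after entry by B
    ceiling_periods: int = 5  # periods a price reduction must be maintained
    qty_cap: Optional[int] = None
    qty_cap_until: int = 0    # last period in which the quantity cap binds
    ceilings: List[Tuple[int, int]] = field(default_factory=list)  # (price, last period)

    def record_entry(self, period: int, last_qty_a: int) -> None:
        """B bought a permit in `period` without holding one in period - 1."""
        self.qty_cap = last_qty_a
        self.qty_cap_until = period + self.freeze_periods - 1

    def record_offer(self, period: int, price: int, prev_price: Optional[int]) -> None:
        """Register A's posted price; a cut creates a temporary price ceiling."""
        if prev_price is not None and price < prev_price:
            self.ceilings.append((price, period + self.ceiling_periods - 1))

    def is_allowed(self, period: int, price: int, qty: int) -> bool:
        """Check a proposed offer by A against both restrictions."""
        if self.qty_cap is not None and period <= self.qty_cap_until and qty > self.qty_cap:
            return False
        return all(price <= cap for cap, last in self.ceilings if period <= last)
```

Under these assumptions, an entry by B in period 6 freezes A's maximum quantity in periods 6 and 7, and a price cut by A in period 8 caps A's price through period 12.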
III. The PLATO Posted-Offer Procedure


Most retail markets are organized under what has been called the 
posted-offer institution (Plott and Smith 1978). As we define it, in this 
institution each seller independently posts a take-it-or-leave-it price at 
which deliveries will be made in quantities selected by each individual 
buyer subject to seller capacity limits. These posted prices may be 
changed or reviewed frequently, infrequently, regularly, or irregu¬ 
larly, but in any case a central characteristic of this mechanism is that 
the posted price is not subject to negotiation. 

The experiments reported here use the posted-offer mechanism
programmed for the PLATO computer system by Ketcham (see
Ketcham, Smith, and Williams 1984). This program allows buyers
and sellers, sitting separately at PLATO terminals, to trade for a
maximum of 25 market “days” or pricing periods. Each display screen
shows that subject’s record sheet, which lists the maximum units that
can be purchased (sold) in each period. For each unit, the buyer
(seller) has a marginal valuation (cost) that represents the value (cost)
of purchasing (selling) that unit. These controlled, strictly private,
unit valuations (costs) induce individual and aggregate market
theoretical supply and demand schedules (Smith 1976). That is, in an
experiment, buyers (sellers) earn cash rewards equal to the difference
between the marginal value (selling price) of a unit and its purchase
price (marginal cost). Sales are “to order” in the sense that there are
no penalties or carryover inventories associated with units not sold (or
units not purchased). Consequently the assigned marginal valuations
and costs induce well-defined flow supply and demand conditions.
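
As a rough illustration of the induced-value reward rule just described, the following Python sketch computes a subject's cash earnings in cents. The function names and the numerical schedules in the usage note are hypothetical and are not taken from the experiments.

```python
def buyer_earnings(values, price, units_bought):
    """Sum of (marginal value - purchase price) over the units bought."""
    return sum(v - price for v in values[:units_bought])


def seller_earnings(costs, price, units_sold):
    """Sum of (selling price - marginal cost) over the units sold."""
    return sum(price - c for c in costs[:units_sold])
```

With a hypothetical cost schedule of 250, 260, and 280 cents, selling all three units at 300 cents earns 110 cents, while selling them at 255 cents produces a loss of 25 cents. The same calculation gives the potential profit (loss) from an offer if all offered units are sold.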

Each period begins with a request that sellers select a price offer by 
typing a price into the computer keyset. This offer is displayed pri¬ 
vately on the seller’s screen. The seller is then asked to select a corre¬ 
sponding quantity at that offer price. Because the essence of the 
predatory pricing hypotheses is that a seller may have a strategic 
reason for pricing below marginal cost, the program we utilized 
placed no restriction except for an ultimate capacity constraint on 
what combination of prices and quantities any seller was permitted to 
post. 

Since it is time and effort costly for a seller to calculate the profit
that any given offer may provide, especially with U-shaped costs,
PLATO always informs the seller of the potential profit (loss) if all
offered units are sold. When satisfied with the selected price and
quantity, the seller presses a touch-sensitive “offer box” displayed on
the screen. This action places that seller’s offer irrevocably into the
market. Before touching the offer box the seller may change the price
and/or quantity as many times as desired. Each seller sees the prices
posted by the other seller only after both have entered their offers by
touching the offer boxes.

Because virtually all of the hypotheses regarding predatory pricing
explicitly or implicitly assume that buyers act to fully reveal market
demand, we needed to incorporate this feature into all 18 of our
experiments. To do this, we used the computerized buyer subroutine
that had proved successful in previous research (CILS 1984). After
both sellers entered their offers, PLATO randomly ordered each of the
five buyers in figure 1 into a buying sequence, just as with human
subject buyers. However, the purchasing decisions were made by a
PLATO program with the buying rule that demand was fully revealed.
That this computerized response would take place, and that the
buyers would purchase all that was profitable to them at the given
prices, was explained to the sellers so that it was not credible for
sellers to harbor even the expectation that demand might be
underrevealed. A trading period ended when the last buyer completed this
buying mode. Sellers were not told what the final period of the
experiment would be.
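
The full-demand-revelation buying rule can be sketched as follows. This is a minimal illustration rather than the subroutine used in the experiments: the function name, the data layout, the treatment of a unit whose value equals the price as a purchase, and the random tie-break between equally priced sellers are assumptions. Prices and values are in cents.

```python
import random


def run_period(offers, buyers, rng):
    """Simulate one period with robot buyers who fully reveal demand.

    offers: {seller: (price, quantity offered)}
    buyers: list of marginal-value lists, each in descending order
    rng:    random.Random instance used for the buying sequence and ties
    """
    remaining = {seller: qty for seller, (_, qty) in offers.items()}
    sales = {seller: 0 for seller in offers}
    order = list(range(len(buyers)))
    rng.shuffle(order)  # random buying sequence
    for b in order:
        for value in buyers[b]:
            open_sellers = [s for s in remaining if remaining[s] > 0]
            if not open_sellers:
                break
            best = min(offers[s][0] for s in open_sellers)
            if value < best:
                break  # no further unit is worth buying for this buyer
            cheapest = sorted(s for s in open_sellers if offers[s][0] == best)
            seller = rng.choice(cheapest)
            remaining[seller] -= 1
            sales[seller] += 1
    return sales
```

Because each buyer always turns to the lowest-priced seller with unsold units, a seller posting the higher price sells only the demand left over once the rival's offered quantity is exhausted.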

There is no difference in physical surroundings or computer in¬ 
teraction depending on whether or not a seller has purchased an 
entry permit. This was done to minimize any extraneous incentives to 
purchase or not to purchase a permit. A seller who chooses not to 
purchase an entry permit remains at the terminal, watching the prices 
posted by the other seller. Since this is a posted-offer market, sellers 
with and without permits are equally passive in computer terminal 
responsibilities once the market has opened to the buyers. 


IV. Alternative Hypotheses 

Predatory pricing is a hypothesis about firm behavior under certain 
structural conditions, which, according to our interpretation, is repre¬ 
sented by specifications 1—5 in Section II. However, in addition to the 
predatory pricing literature, an extensive oligopoly literature has 
identified many other hypothesized modes of pricing behavior when 
numbers are few. Because we could not be sure that we would observe 
predatory behavior, it was necessary for us to consider alternatives 
that we might observe under conditions thought to favor such behav¬ 
ior. This necessity was underlined when the first few experiments 
failed to yield predatory pricing, and therefore early in the search 
process we were motivated to specify alternative outcome hypotheses. 
Although the literature was helpful in suggesting alternative
hypotheses, we do not find it helpful in providing a coherent, clear
delineation of the conditions necessary to yield each of the various modes of
pricing behavior. In view of this we decided to err on the side of
overspecification by assuming that any behavioral hypothesis might
apply as long as it assumed that one or both firms choose prices. This
seems to rule out only the Cournot quantity-adjuster model of
oligopoly.

A. Predatory Pricing 

Based on the literature summarized in Section II above there are two
elements in the definition of predatory pricing. First, the price
charged by the predator is lower than would be optimal in a simple
myopic (short-run) pricing strategy. Second, the price has the effect
of preventing entry, or driving out and preventing reentry, of the
prey. In our experimental design if there is a predator we expect it to
be firm A, with firm B the prey, since we assume that predatory action
by firm B would be suicidal and that the agent for firm B will become
aware of this assessment. Therefore, we interpret the first element to
mean that \(P_A(Q_A) < MC_A(Q_A)\), where \(P_A(Q_A)\) is the inverse demand
function, and the second element to mean that \(P_A(Q_A) < \min AC_B(Q_B)\).
For seller A in our design, price offers below $2.66 are
potentially predatory, depending on the quantity offer chosen by A.
For example, if seller A posts a price of $2.64 but limits the quantity
offered to 7 units, then \(P_A(7) > MC_A(7)\), and this strategy leaves some
(contingent) excess demand in the market for firm B to satisfy at a
higher price. We define a predatory action by seller A to be a posted
price less than $2.66 accompanied by the selection of a quantity of at
least 8 units. Thus, a predatory action is defined by the choice
(\(P_A \in [\$2.60, \$2.65]\), \(Q_A \ge 8\)) since this yields
\(P_A(Q_A) < MC_A(Q_A)\) and \(P_A(Q_A) < \min AC_B(Q_B)\). Since
\(P_A(Q_A) > AVC_A(Q_A)\), this predatory action still
generates a positive profit for firm A; our design is deliberately
rigged to allow firm A to predate without imposing a loss on itself.
Consequently, although there is a short-run opportunity cost of
predation to the predator, there is no net out-of-pocket loss.
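
The operational definition of a predatory action can be written as a one-line test on seller A's offer. The sketch below uses the price range and minimum quantity stated in the text; the function and constant names are illustrative, and prices are in cents.

```python
PREDATORY_PRICE_RANGE = (260, 265)  # $2.60 to $2.65
MIN_PREDATORY_QUANTITY = 8


def is_predatory_action(price, quantity):
    """True if seller A's offer meets the definition of a predatory action."""
    low, high = PREDATORY_PRICE_RANGE
    return low <= price <= high and quantity >= MIN_PREDATORY_QUANTITY
```

A posted price of $2.64 with 7 units offered, for example, is not classified as predatory, whereas the same price with 8 units offered is.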

In summary, if we observe firm A choosing predatory prices and 
quantities that conform to this definition followed by firm B exiting 
and not reentering even when firm A subsequently increases its price, 
we will count such an observation as supporting the predatory pricing 
hypothesis. However, if we observe firm B exiting the market and 
electing not to reenter in response to price cutting by firm A that is 
not predatory, we will interpret this to mean that firm B is particularly 
vulnerable to price-cutting actions, and we will still count such an 
observation as supporting the predatory pricing hypothesis. Behavior
of this type would suggest that firm A had established a credible
predatory threat to firm B without pricing in the defined predatory
range.

B. Competitive Equilibrium 

If predatory behavior is manifest, but it fails to eliminate the small 
firm from the market, the result may be to spoil any effective tacit 
cooperative coupling between the two firms. Hence, the competitive 
equilibrium may prevail, as a default outcome, from the failure of 
predatory attempts. Alternatively, price cutting may be less severe 
than the predation model suggests but be sufficient to lock the two 
firms into the competitive equilibrium. The extensive experimental 
evidence favoring the competitive equilibrium, under different trad¬ 
ing institutions, when numbers are few suggests a strong a priori case 
for this hypothesis in the present design. In this design, competitive
equilibria are defined by \(Q_A = 7\), \(Q_B = 3\), and prices in the interval
[2.66, 2.76].

C. Dominant Firm Equilibrium 

If firm B is assumed to be a price taker, or adapts to its disadvantaged
position by becoming a price taker, then a possible outcome is that of
the dominant firm equilibrium. This is often associated with a
leader-follower argument or a minorant game institution in which firm A
moves first and firm B moves last, responding with the quantity that
maximizes profit given the price quoted by A. In a repeat
simultaneous move game, firm B might still be regarded as moving after A, in
the subsequent period, so that the leader-follower posture could still
emerge.

The traditional analysis yields the dominant firm equilibrium price
(\(P_{DF} = \$2.84\)) and corresponding quantities \(Q_A = 6\) and \(Q_B = 3\). This
is obtained by assuming that firm B matches any price posted by A
and chooses the quantity that maximizes B’s profit.

In preparing this design, we calculated the joint profit matrix for
firms A and B for a subset of the feasible prices that can be posted by
them. In this matrix, if firm A posts the dominant firm equilibrium
price, \(P_{DF} = \$2.84\), then the best response of firm B is also to post this
price. At these prices firm A offers 10 units, firm B offers 3 units, and
expected profits are \((\pi_A, \pi_B) = (\$1.99, \$0.51)\) per period. Strictly
speaking, this is slightly different from the dominant firm model
profit shares, based on certain demand in which the large firm cedes
the residual supply to the fringe. In our design, this would require
seller A to limit \(Q_A\) to 6 units, and the profit shares would be ($1.96,
$0.54) per period.
D. Edgeworth Price Cycles

Inspection of the joint profit possibilities also reveals the clear
potential for an Edgeworth cycle in duopoly pricing. If the two firms start at
the dominant firm equilibrium \((P_A, P_B) = (\$2.84, \$2.84)\), firm A has
an incentive to cut price one cent to $2.83. But this wipes out the
profit of firm B, whose best reply is to match A’s price, giving A an
incentive to cut to $2.82, and so on, until prices fall to
\((P_A, P_B) = (\$2.79, \$2.79)\). At this point, A’s incentive is to raise price back to
$2.84, with B then matching this price.
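
The cycle can be traced with a short stylized simulation. The sketch below is an illustration under stated assumptions and not a model estimated from the data: firm A is assumed to undercut by one cent whenever prices are matched, to reset to $2.84 once matched prices reach $2.79, and firm B is assumed to match A's price from the preceding period. The function name and defaults are hypothetical; prices are in cents.

```python
def edgeworth_path(high=284, floor=279, periods=13):
    """Return a list of (P_A, P_B) pairs tracing a stylized Edgeworth cycle."""
    p_a = p_b = high
    path = [(p_a, p_b)]
    for _ in range(periods - 1):
        prev_a = p_a
        if p_a == p_b:
            # Matched prices: A undercuts by one cent, or resets at the floor.
            p_a = high if p_a == floor else p_a - 1
        # Otherwise A holds its price while B catches up.
        p_b = prev_a  # B matches A's price from the preceding period
        path.append((p_a, p_b))
    return path
```

Under these timing assumptions the path starting at (284, 284) reaches (279, 279) after ten periods and returns to (284, 284) two periods later.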


E. Shared Monopoly (Tacit Collusion) 

If firms A and B are able to effect cooperation through price
signaling, this strategy will be most effective if they (1) maximize joint
profits and (2) divide this profit in a manner that will sustain the tacit
“agreement” (the two firms cannot communicate except through the
prices they select). The largest collective profit is for all production to
be allocated to firm A, who charges the monopoly price \(P_M = \$3.21\)
and sells the quantity \(Q_M = 5\), yielding \(\pi_M = \$3.43\) per period for
firm A. But in the absence of a mechanism for agreement, including
an imputation of a share of this profit to B, there is no way to effect
this outcome. Through signaling, it is conceivable that the two firms
might work out an alternating sequence in which A and B take turns
satisfying the whole market at their respective monopoly prices. This
would yield a profit that averages one-half the monopoly profit for
each firm. Under this scenario we would have
\((P_A, Q_A, \pi_A) = (\$3.21, 5, \$1.71)\) and
\((P_B, Q_B, \pi_B) = (\$3.52, 3, \$1.29)\). A less sophisticated
form of tacit collusion would be for the firms to post the same price
and then share the market according to the demands realized via the
random choice of firm made by each buyer in the posted-offer
mechanism. At the shared monopoly price \(P_M = \$3.21\), joint profit is a
maximum, and \((\pi_A, \pi_B) = (\$1.80, \$1.21)\). But by defecting at a price
one cent less, firm A can reap a substantial increase in profit. This is
the case for all matching price strategies, and thus the maintenance of
such strategies clearly requires cooperation by firm A. The same
proposition holds for firm B except that the gains from defection are
much smaller.

Since the CIS and CILS experiments did not yield any outcomes
tending to support the attainment of a shared monopoly through tacit
collusion, we doubted that such outcomes would be likely even in the
present asymmetric cost design. However, we conjectured that under
the PPAP treatment, tacit collusion would be more likely. Under this
treatment firm A is constrained not to expand output for 2 periods
after firm B enters, and any price reduction cannot be reversed for 5
periods. At a collusive high price this constraint makes it more costly
for A to punish B for defection. Firm B, knowing that any cut in price
by A cannot be reversed for 5 periods, may be hesitant to defect and
risk being locked into a lower price pattern. Similarly, at low prices if
A signals with a price increase, this action may have greater credibility
under the PPAP for firm B and may increase the probability that B
will follow.


F. Relative Profitability of Alternative Outcomes 

Some of the debate on the appropriateness of predation models has 
centered on the profitability to the large firm of a predation strategy 
relative to alternative tactics. Two observations regarding this discus¬ 
sion relate to our experimental design. First, we note (McGee 1958) 
that one commonly proposed alternative to a strategy of predation is a 
buyout of firm B by A. This is not allowable in our particular design 
but could be incorporated into an extension of our design. 7 Second,
notice that seller A makes a profit of $3.42 per period as an
uncontested monopolist, $1.99 per period in expected profits as a dominant
firm, from $1.10 to $1.80 per period in the competitive price range,
and (at most) $0.88 per period with a predation strategy. Thus, the
reader should note that firm A’s estimates of the direct profitability of 
predation depend crucially on A’s expectations about firm B’s exit 
behavior. Firm A’s decision to pursue a predatory strategy depends, 
furthermore, on the profitability of predation relative to the profits A 
expects to receive if B stays in the market. Predation will look less 
attractive if firm A expects that the two firms will stabilize at a collu¬ 
sive price level near the shared monopoly price than if A expects that 
having B in the market will cause prices to collapse to the competitive 
range. 
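
A back-of-the-envelope comparison makes this dependence concrete. The sketch below uses the per-period profits quoted above (in cents) and asks how many periods of predation firm A could sustain before predation stops paying relative to a given alternative. The function names, the 20-period horizon in the usage note, and the assumptions that firm B exits for certain after the predation phase, never reenters, and that there is no discounting or permit cost are all illustrative simplifications.

```python
PREDATION_PROFIT = 88   # at most $0.88 per period while predating
MONOPOLY_PROFIT = 342   # $3.42 per period as an uncontested monopolist


def predation_payoff(predation_periods, horizon):
    """Total profit from predating for k periods, then monopolizing."""
    return (predation_periods * PREDATION_PROFIT
            + (horizon - predation_periods) * MONOPOLY_PROFIT)


def longest_worthwhile_predation(horizon, alternative_profit):
    """Largest k for which predation beats earning the alternative every period."""
    best = None
    for k in range(horizon + 1):
        if predation_payoff(k, horizon) > alternative_profit * horizon:
            best = k
    return best
```

Over a hypothetical 20-period horizon, predation beats the dominant firm profit of $1.99 per period only if firm B exits within 11 periods, and beats the top of the competitive range ($1.80 per period) only if firm B exits within 12 periods.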

V. Experimental Results 

We report the results of 18 experiments using the six different
treatment conditions shown in table 1. Series 1-5, consisting of 11
experiments, imposed alternative conditions thought to be favorable
(perhaps progressively more favorable) to the emergence of predatory
pricing behavior.

7 An obvious question is whether our prohibition against mergers makes predation
more or less likely. If a buy-out is, as suggested by McGee (1980), a relatively attractive
substitute for predation, then forbidding such a substitute strategy is consistent with
our goal of creating conditions in which predation is relatively likely. However, Burns
(1984) has suggested that in the case of the old American Tobacco Company, buy-outs
may have been an integral part of a predatory campaign.

Figures 2-5 chart the sequential prices and corresponding sales
quantities for one experiment from each of the series 1, 2, 3, and 6.
Posted prices in each period are indicated by the solid and open
circles. Thus in figure 2 (experiment 129) for period 8 the solid circle,
denoted “A8,” shows that A sold 8 units at the posted price, $2.90.
The open circle, denoted “B0,” indicates that B sold zero units at the
posted price, $3.15. Periods such as 1-5 in figure 3 (experiment 135)
show only the price posted by the incumbent seller A, indicating that
seller B was not allowed to purchase a permit in periods 1-5. On all
charts the monopoly price for A (3.21), the dominant firm price
(2.84), the competitive interval [2.66, 2.76], and the potential
predatory price range [2.60, 2.66] are marked on the far right. Finally, for
experiment 153 subject to the PPAP, the heavy black arrow near the
bottom of the chart denotes periods in which seller A has triggered a
temporary price ceiling on himself through a reduction in price. The
numbers along the arrow state the operative price ceiling.

Table 2 summarizes the performance of the 18 experiments. Each
experiment is scored according to which type of pricing behavior was
the plurality in the first 18 potentially contested periods: shared monopoly,
dominant firm, competitive, or predatory (including both
predatory pricing and monopoly pricing by a successfully predatory
large firm). Since many observations are not precisely at any of those
models’ predictions, a simple metric was used in the scoring. Each
price observation was counted as a hit for the model whose price
prediction was nearest to the observation.

TABLE 2

Experimental Outcomes

(Each Experiment Scored according to Plurality of Price Outcomes in the First 18
Potentially Contested Periods)

Performance Hypotheses

Treatment Series | Shared Monopoly | Dominant Firm | Competitive Equilibrium | Predation: Predatory Pricing or Monopoly Pricing by a Surviving Firm
1 | 1 (133) | 1½ (129, 131-tie) | ½ (131-tie) | 0
2 | 0 | 2 (135, 136) | 1 (138) | 0
3 | 1 (140) | 2 (139, 141) | 0 | 0
4 | 0 | 1 (142) | 0 | 0
5 (confederate) | 0 | ½ (2d half of 143) | 0 | ½ (1st half of 143)
6 (antitrust) | 5 (145, 146, 147, 149, 150, 152, 153) | 2 (147, 149) | 0 | 0
1-4 pooled | 2 | 6½ | 1½ | 0
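The nearest-prediction scoring rule can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the competitive prediction is treated as the interval [2.66, 2.76] marked on the charts, and the predation category is left out because it depends on quantities and costs as well as on price.

```python
from collections import Counter

# Price predictions taken from the chart legend; each is a (low, high) interval.
# Point predictions are stored as degenerate intervals.
PREDICTIONS = {
    "shared_monopoly": (3.21, 3.21),
    "dominant_firm": (2.84, 2.84),
    "competitive": (2.66, 2.76),
}


def distance_to_interval(price, low, high):
    """Distance from a price to the nearest point of [low, high]."""
    if price < low:
        return low - price
    if price > high:
        return price - high
    return 0.0


def classify_price(price):
    """Return the model whose price prediction is nearest to the observation."""
    return min(
        PREDICTIONS,
        key=lambda model: distance_to_interval(price, *PREDICTIONS[model]),
    )


def score_experiment(prices):
    """Return the model(s) with the plurality of hits, sorted by name.

    More than one name is returned when models tie.
    """
    hits = Counter(classify_price(p) for p in prices)
    top = max(hits.values())
    return sorted(model for model, n in hits.items() if n == top)
```

Returning a list rather than a single label keeps ties visible, which is how a split score such as a half point per category could arise.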

From table 2, there are two strongly supported general conclusions:
(1) the absence of any predatory pricing behavior and (2) the radically
different behavior of those markets conducted under the antitrust 
treatment rules. Each of these observations will be examined in more 
detail. 

1. The absence of predation. We will summarize our experimental 
results by providing a brief narrative on each of the series 1-5, followed
by a general discussion of all 10 experiments.

Series 1. We began our research with three experiments incorporating
design features 1, 2, and 3. We found no evidence of what we had
designated a priori as predation. The three large sellers posted 73 
prices, and none satisfied our definition of predatory pricing. 

There were three instances in which a large seller posted a price in 
the potentially predatory range. But, in each of these three cases, the 
large seller restricted quantity to 7 units, so the price was above both 
marginal and average cost. Each of these instances was in period 1, 
and none of the three large sellers ever repeated a price in this predatory
range. It is perhaps arguable that this action can be interpreted
as either (1) the early period price experimentation of a seller who
does not have prior knowledge of demand or the costs of the rival or
(2) a supersophisticated signal of a potential willingness to predate in
the future. We are highly skeptical of the second interpretation, since 
all three instances occurred in period 1 and none was ever repeated. 
However, whether such strategic signaling behavior was in the minds 
of the large sellers or was capable of predatory interpretation by the 
small seller, the behavior does not match our interpretation of any 
consensus definition of predatory pricing, since price was not below 
average or marginal cost, and in each case the small seller picked up 
some residual demand and a rewarding profit. 

Series 2. After our failure to observe predation in the first three 
experiments, we added design feature 4 (sunk entry costs) for the 
next three and introduced the incumbency treatment. Again, there 
were no predatory price-quantity pairs chosen out of the 69 observations.
In this series, there were only two cases of a price posted in the
potentially predatory range, and in both cases quantity was restricted 
so price was not less than marginal or average cost. In the experiments
requiring that firms purchase an entry permit, we have a
stronger test of whether large seller pricing activity can successfully
signal predatory threats even if not at predatory levels. Of the 54
periods in which the small sellers could contest the market with a 
permit, they did so in all 54 periods. 

Series 3. In these three experiments we went back to the drawing 
board to see what design features we might add to capture the phenomenon
of predation. As described previously in Section II, we
decided on design feature 5 (complete information). We speculated 
that if both firms were clearly aware of the advantages of the large 
firm, expectations might foster predation or the fear of predation 
leading to the exit of the small firm. We were wrong. None of the 69 
decisions by large firms was predatory. Only one seller A’s price was 
even potentially predatory, and it was, as before, accompanied by a 
quantity restriction. The small firm stayed in the market in 54 out of 
54 possible periods. 

Series 4 (experiment 142). Having failed to find predation in the first 
nine experiments, we wondered whether predation could be induced 
by the creation of “rivalistic” incentives having nothing to do with the 
underlying economic structure. To test this conjecture, we privately 
informed seller A that we would pay him $1.00 for each period in 
which seller B chose not to purchase a permit. The rivalistic seller
never posted a potentially predatory price; in fact, only once did seller 
A post less than $2.83. The small firm never failed to purchase a 
permit. 

Series 5. After 10 unsuccessful efforts to foster predation, we became
seriously concerned that there might be some flaw in our design
that was muting (what we assumed to be) the vulnerability of the small 
seller. Therefore, in experiment 143, we decided to push rivalry to its 
extreme point and choose as seller A a confederate (a graduate student),
whose personal incentives were direct instructions to post predatory
prices and quantities in periods 1-11. This fact was, obviously,
concealed from the small firm. Seller B entered the market in period 
6, was shut out in periods 6-10 (incurring the $1.00 permit loss), and 
refused to renew his permit in period 11. His decision not to reenter 
was reiterated to us at the beginning of period 12, and we signaled 
our confederate to begin to try to take advantage of his monopoly 
position. There is perhaps a strong clue to the weakness of the predatory
pricing folklore in the subsequent behavior of our small seller.
Despite being mercilessly pummeled by seller A, losing money, and 
twice deciding not to submit to such punishment again, seller B took 
only 1 period to look at seller A’s price increase (period 12) and 
reentered (period 13) to capture some (perhaps transient) supernormal
profits. That is, the difference in firm size and costs, economies of
scale, nontrivial sunk entry costs, and an asymmetric deep pocket,
combined with the actual experience of being forced out of the market
due to losses, were not enough to preempt reentry when the
predator attempted to take advantage of his newly established monopoly
position. While we could have had seller A retaliate, in period
13 alone seller B earned $1.32, which more than covered his reentry 
cost. This poses two obvious questions for further research. First, how 
much punishment in the form of retaliation by seller A is necessary to 
keep seller B from reentering? Does this required level of retaliation 
so weaken seller A’s profit picture that seller A would be better off 
coexisting with seller B? Second, what would happen if seller B had to 
publicly announce an intention to reenter at least 1 period before 
reentry could occur? 

General discussion of series 1-5. Although we observe no instances in
which P_A(Q_A) is in accord with our strict definition of predatory pricing,
there are several sequences in which firm A’s pricing behavior
might be interpreted by firm B as having a predatory quality. For 
example, in experiment 135 (fig. 3), periods 7—9, firm A ignores firm 
B’s repeated signal to raise the price. Then in period 10, firm B 
matches price with firm A, whereupon in periods 11-14 firm A repeatedly
undercuts firm B’s previous price. Firm A eventually seems
to concede that this strategy is failing and engages in (fruitless) signals 
to raise price in periods 19-23. Similar results obtained in another 
experiment (140 in fig. 4). In both these experiments we can imagine 
that firm B might feel that he or she had been the victim of predatory 
behavior and might be tempted to file suit, given triple damage legal 
incentives and the vagueness of marginal cost in the nonexperimental 
world. 

But if predation is not a satisfactory hypothesis for explaining firm 
behavior in this market environment, then a logical followup question 
is to ask which (if any) of the alternative hypotheses are being supported.
Refer again to table 2 and the pooled results from series 1-4.
(We exclude series 5, since it incorporated a confederate.) In these
10 experiments, the modal (in fact, majority) observation supported
the dominant firm prediction. Of the 10 experiments, six-and-a-half
were best described by the dominant firm model.

More evidence of the plausibility of the dominant firm model in this 
design can be seen by considering the confederate experiment (143). 
Beginning with period 17, we signaled our confederate to begin posting
the pair (P = 2.84, Q = 6), which is the large firm’s dominant firm
strategy. We wondered if this behavior would indeed attract seller B 
to his competitive fringe strategy (P = 2.84, Q = 3). The answer was 
yes. This suggests that the leader-follower flavor of the dominant firm 
model can be captured in an iterative environment in which, technically,
both firms move simultaneously in any 1 period.

One shortcoming of the mutually exclusive fourfold categorization
of table 2 is that it does not account for our fifth alternative hypothesis,
the Edgeworth cycle. This is because these cycle prices include
$2.84, the dominant firm price. Yet it would be useful to ask
whether the eight experiments in series 1-4 that were scored either 
“competitive” or “dominant firm” were being driven by the dynamics 
suggested by the Edgeworth model. One arbitrary measure is to ask 
whether the runner up, or second place outcome, in the two categories
(competitive, dominant firm) accounted for as many as one-third
of the number of observations as the primary category. If the answer
is yes, this indicates a lot of activity between the competitive and
dominant firm prices, which is at least consistent with the Edgeworth
model. If the answer is no, this could indicate either acyclical (equilibrated?)
behavior or cycles outside the Edgeworth range (perhaps
cycles of success and failure in firms’ attempts to establish tacit 
cooperation). 
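The one-third rule of thumb described above can be written as a short check. This is a sketch under the assumption that the inputs are the counts of price observations scored "competitive" and "dominant firm" for one experiment; the function name is hypothetical.

```python
def is_edgeworthian(competitive_hits, dominant_hits):
    """Apply the one-third rule to one experiment's hit counts.

    The experiment is flagged when the runner-up category has at least
    one-third as many observations as the primary category.
    """
    primary = max(competitive_hits, dominant_hits)
    runner_up = min(competitive_hits, dominant_hits)
    if primary == 0:
        # No observations in either category, so nothing to classify.
        return False
    # Integer comparison avoids floating-point division.
    return 3 * runner_up >= primary
```

For example, 12 dominant firm hits and 5 competitive hits would be flagged, while 15 and 3 would not.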

Using the categorization above, one finds that only two of the first 
10 experiments, 131 and 138, can be classified as Edgeworthian. The 
others show no single consistent pattern. For example, one experiment
(141) converged very closely to the dominant firm prediction
while another (129 in fig. 2) appeared more unstable. Its cycles away
from the dominant firm price tended to be toward the monopoly 
rather than the competitive prediction. 

2. The effect of antitrust procedures in series 6. The seven experiments 
incorporating our PPAP were all conducted with design features 1-4.
Thus, in the absence of the antitrust rules, the treatment is that of 
series 2. This raises the following question. When one discusses the 
effects of antitrust rules, what is the appropriate control sequence? Is 
it just series 2, or is it the pooled results from series 1-4? Series 2 by 
itself is the more exact structural control, but there are only three 
observations. Pooling adds more information, and the results from 1,
3, and 4 seem consistent with 2. But pooling runs the risk of introducing
some specification error. We therefore will report both comparisons.
It happens that the qualitative results are robust with respect to
the pooled or not-pooled control. 

The fundamental conclusion from our series 6 experiments is the 
existence of a type 2 regulatory error. That is, adding rules against 
predation in an environment where predation might be expected to 
occur may not be benign. Our results show a performance that is less 
competitive and less efficient with the safeguards against predation in 
place. 

This qualitative result can be seen in at least three different ways. 
First, the effect can be seen at a glance in the data of experiment 153
(fig. 5), which vividly demonstrates the most extreme example we 
observed showing how the antitrust rules can provide incentives for 
tacit cooperation near the monopoly price and quantity. 

Second, refer again to the classification of the experiments in table
2. Suppose we combine the observations so that we count each experiment
as either (i) a shared monopoly or (ii) not a shared monopoly.
The proportion of shared monopolies in series 6 is .71 while in series
2 it is zero. (A test on this difference in proportions is significant at
α = .05.) Comparing series 6 with the pooled proportion of series 1-4,
one gets .71 against .20 (which is also significant at α = .05). Thus,
we can reject the hypothesis that there was no shift toward shared
monopoly outcomes when the PPAP rules were applied.
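One way to compare two proportions of this kind is a pooled two-proportion z-test, sketched below. The text does not say which test was used, so the choice of a normal approximation is an assumption, and with samples this small it is rough; an exact test need not give the same answer. The counts (5 of 7 in series 6, 0 of 3 in series 2, 2 of 10 in pooled series 1-4) are read from table 2.

```python
from math import erf, sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and one-tailed p-value for H1: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of the standard normal distribution.
    p_one_tailed = 0.5 * (1 - erf(z / sqrt(2)))
    return z, p_one_tailed


# Series 6 against series 2, then series 6 against pooled series 1-4.
z_series2, p_series2 = two_proportion_z(5, 7, 0, 3)
z_pooled, p_pooled = two_proportion_z(5, 7, 2, 10)
```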

Third, one can examine directly the efficiency criterion of market 
performance. In figure 6, we have graphed the period-by-period 
measure of what we call “quasi efficiency.” (This measures the ratio of 
realized surplus obtained by the participants to the maximum possible 
surplus, without attempting to amortize into this ratio the cost of the 
entry permits where they were required.) A fully competitive market 
would be 100 percent efficient by this measure. A fully rationalized 
cartel would score 72.5 percent. Again, a comparison of the charts in 
figure 6 is striking. In every period, the markets with PPAP performed
less efficiently, on average, than the markets without this
treatment (using either series 2 or pooled series 1-4 as the control). A
statistical test of the significance of this difference is presented in table
3. We present four t-tests on the difference in efficiencies in 2 x 2
dimensional form. Two of these tests use series 2 data as the control;
two use pooled data from series 1-4. On the second dimension, two
tests use the mean of all 18 potentially contested periods as the base
and two use the mean of only the last five (this latter was a measure
introduced in CILS). In all four cases the direction of the difference
shows lower market efficiency with the antitrust rules. In three of the
four tests, the difference is significant using a one-tailed t-test at α =
.05.

TABLE 3

Tests of Mean Efficiencies With and Without Antitrust Rules in Effect

Control | Based on All 18 Contestable Periods | Based on Last 5 Periods
Series 2 alone | Difference = -12.17%, t = -2.00* | Difference = -8.25%, t = -1.57
Pooled series 1-4 | Difference = -7.91%, t = -1.89* | Difference = -7.37%, t = -2.056*

Note. Negative numbers indicate reduced efficiency with antitrust rules.
* Significant, α = .05.
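The quasi-efficiency measure and a difference-in-means test can be sketched as follows. The text does not specify the form of the t-test, so the equal-variance (pooled) two-sample statistic used here is an assumption, and both function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def quasi_efficiency(realized_surplus, max_surplus):
    """Realized surplus as a percentage of the maximum possible surplus.

    Entry permit costs are not amortized into the ratio.
    """
    return 100.0 * realized_surplus / max_surplus


def pooled_t_statistic(treatment, control):
    """Equal-variance two-sample t statistic for mean(treatment) - mean(control)."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
```

A negative statistic indicates lower mean efficiency in the treatment group, matching the sign convention of table 3.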


VI. Conclusions 

Based on the results of 11 predatory pricing experiments, our principal
conclusion is that, so far, the phenomenon has eluded our search.
We are unable to produce predatory pricing in a structural environment
that, a priori, we thought was favorable to its emergence. The
predominant outcome is that of the dominant firm equilibrium. 
These results would appear to be consistent with Selten’s (1978) 
game-theoretic analysis of predation in which such a strategy is inconsistent
with the perfect equilibrium solution concept. By backward
induction at each stage, predation does not pay. At each stage an 
entrant knows that if it enters and is preyed on it would have been 
better not to enter. But this expectation is offset by the potential 
prey’s also knowing that with actual entry the incumbent is better off 
not to predate. Thus it is rational throughout the history of the market
for the incumbent not to predate and for the potential prey to
enter. 
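The backward induction argument can be illustrated on a single entry stage. The payoffs below are hypothetical numbers chosen only to satisfy the ordering the argument requires (preying is costly to the incumbent once entry has occurred); they are not taken from the experiments.

```python
# Hypothetical payoffs as (entrant, incumbent) for one stage of the entry game.
PAYOFFS = {
    ("stay_out", None): (0, 5),
    ("enter", "accommodate"): (2, 2),
    ("enter", "prey"): (-1, 0),
}


def solve_stage_game(payoffs):
    """Solve the two-move game by backward induction.

    Step 1: find the incumbent's best reply once entry has occurred.
    Step 2: let the entrant choose, anticipating that reply.
    """
    best_reply = max(
        ("accommodate", "prey"),
        key=lambda action: payoffs[("enter", action)][1],
    )
    value_of_entering = payoffs[("enter", best_reply)][0]
    value_of_staying_out = payoffs[("stay_out", None)][0]
    entrant_move = "enter" if value_of_entering > value_of_staying_out else "stay_out"
    return entrant_move, best_reply


outcome = solve_stage_game(PAYOFFS)
```

Because the last stage of a finitely repeated game is solved this way regardless of history, the same reasoning applies to each earlier stage in turn.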

Where next in the parameter space should one look for predatory
pricing? We suspect that more work on rivalistic behavior might be 
fruitful. Although our one attempt to introduce rivalistic incentives 
failed to yield a predatory outcome, we still think this is a direction of 
search that might have good results. This direction abandons the 
concept of rational predatory action and is contrary to the mainstream
economic theory exercise. It also abandons the objective of
asking whether predatory behavior will arise “naturally,” as a human
trait, in the laboratory. To deliberately induce rivalistic behavior is to
assume that such behavior exists in the field but for some reason has 
not been manifest in the laboratory. This calls for harder evidence on 
the mainsprings of behavior in alleged predatory cases in the field 
than we have been able to discern in the literature. 

A second potentially promising direction is suggested by the game-theoretical
literature on reputation (Kreps and Wilson 1982; Milgrom
and Roberts 1982). In this literature Selten’s paradox (the inconsistency
of predation with perfect equilibria) is resolved by an imperfect
information assumption: either that agents are uncertain of the payoffs
of their fellows (Kreps and Wilson 1982) or that they are uncertain
whether such payoffs are uncertain (absence of common knowledge)
(Milgrom and Roberts 1982). These assumptions lead to models
of rational predation in which it pays an incumbent to predate following
entry because the resulting reputation deters future entrants and
these future benefits outweigh the earlier short-term losses. Of
course, the imperfect information assumptions were part of our experimental
design, and reputation effects did not arise naturally.
However, our use of a confederate in one experiment could be expanded
to attempt consciously to create reputations of the type investigated
in these models. Again, this involves a departure from the
search for naturally occurring behavior that is predatory, and the 
justification for this raises methodological issues that have not been 
examined in any depth. 

We think there is a sense in which all of the existing predatory 
models, as well as our experimental design, are deficient. Entry requires
capital investment, exit implies divestiture, and (with the exception
of general purpose broadly marketable capital, like trucks)
the value of an entrant’s capital stock should not be assumed to be 
independent of whether predatory pricing occurs. If the capital stock 
is specialized (e.g., railroad track), an exiting prey will surely not be 
able to recover more than a fraction of replacement cost from any 
potential new entrant. But this means that a new entrant can buy in as 
a competitor at a capital cost that has already discounted the expectation
of predation. Hence some profitability is assured a new entrant,
while if predation is discontinued, supranormal profits will be enjoyed.
Unless the predator buys the discounted capital stock of the
prey (Burns 1984), predation merely bankrupts the prey firm but fails 
to eliminate the existence of a competitor. Bankruptcy gets rid of 
incumbent management, not capital assets, which are reallocated to 
new managers. 

The results from our seven experiments with the predatory pricing 
antitrust rule form the basis for our second major conclusion. We 
have evidence for the existence of a type 2 regulatory error. The 
antitrust regulations imposed on a market that might be thought to be 
susceptible to predatory pricing caused the market to perform less 
competitively and less efficiently than in the absence of any regulations
against predation. We cannot say that any regulations against
predatory pricing would have this effect, although these results 
graphically display the potential for efficiency losses from programs 
providing for output expansion limits combined with rules requiring 
semipermanence of price reductions. More generally, we believe that 
these results emphasize the necessity for public policymakers to 
realize that any remedies designed to correct alleged market deficiencies
may provide counterproductive incentives. Their task may
become one of evaluating various proposals on the basis of which one
might result in the largest net benefit, not which one corrects a particular
defect.


The Pricing of Forward Contracts
for Foreign Exchange 


Robert A. Korajczyk 

Northwestern University 


This paper investigates the nature of observed deviations from the 
unbiased expectations hypothesis in the forward foreign exchange 
market. If these deviations are due to risk premia then the same 
premia should be observed in nominal bonds denominated in different
currencies. This condition imposes testable restrictions on the
parameters of a multivariate regression model. The empirical results 
are consistent with a world in which time-varying risk premia cause 
the observed deviations from unbiased expectations. 


I. Introduction 

Much of the literature on forward and futures markets is concerned 
with the hypothesis that the current forward (or futures) price is 
equal to the expected value of the spot price at the maturity date of 
the contract. That is, 

\[E(S_{t+1} - G_t \mid \phi_t) = 0, \tag{1}\]

where

\(S_{t+1}\) = the price, at time \(t + 1\), for immediate delivery of the good
(spot price);

\(G_t\) = the forward price, set at time \(t\) but payable at \(t + 1\), for
delivery at time \(t + 1\);

\(\phi_t\) = the set of information available at time \(t\); and

\(E(\cdot)\) = the expected value operator.

This paper has benefited from the comments of a number of individuals. Particular 
thanks are due to Craig Ansley, Susan Chaplinsky, Eugene Fama, Jonathan Ingersoll, 
Robert Hodrick, John Huizinga, Allan Kleidon, Merton Miller, Frederic Mishkin, 
Michael Mussa, Arnold Zellner, and an anonymous referee. 
This “unbiased expectations hypothesis” (UEH) implies that the expected
nominal payoff to holding forward or futures contracts is zero
(e.g., Telser 1958; Cootner 1960). 

The market for forward delivery of foreign currencies has been 
subjected to numerous tests of UEH. Although some of the results 
are mixed, there is substantial evidence that forward prices for 
foreign exchange are not equal to the best linear predictors of the 
future spot price (e.g., Hansen and Hodrick 1980, 1983; Agmon and 
Amihud 1981; Bilson 1981). 

The usual theoretical justification for unbiased expectations is the 
assumption that risk-neutral agents (speculators) arbitrage away any 
nonzero expected payoffs.1 This paper tests the hypothesis that observed
deviations from unbiased expectations are consistent with a
class of equilibrium models with risk-averse agents (e.g., Brock 1978; 
Lucas 1978; Breeden 1979). When applied to the forward foreign 
exchange market, these models predict that UEH will not hold, in 
general, because of the existence of equilibrium risk premia (see 
Fama and Farber 1979; Lucas 1981; Stulz 1981). In particular, the 
models imply that the risk premia in forward prices should be identical 
to the risk premia differential in the real returns on default-free 
nominal bonds denominated in the respective currencies. This implication
leads to testable restrictions on the parameters of a system of
regression equations. 

Section II covers, in more detail, the unbiased expectations hypothesis,
tests of the hypothesis, and the nature of the testable restrictions
implied by the relation between deviations from UEH in the
forward market and real return differentials in the bond market. 
Section III contains a description of the data, a discussion of the 
estimation and testing procedures, and the test results. The data are 
consistent with an economy in which risk premia enter symmetrically 
in the forward exchange market and the bond market. The tests are 
not tests of a particular asset-pricing model. Rather, they test whether 
the data are consistent with a wide class of rational models of asset 
pricing. Section IV gives a summary, conclusions, and possible extensions
of this work.

1 However, it is not clear why agents would have utilities that are linear in nominal
payoffs when relative prices are stochastic.

II. Expected Real Interest Rates and Forward
Prices

The unbiased expectations hypothesis can be tested in a variety of
ways. Examples of such tests can be found in Cornell (1977), Frenkel
(1977, 1981), Hansen and Hodrick (1980, 1983), Agmon and
Amihud (1981), and Bilson (1981). One standard method is to test for
zero correlation of the percentage forecast error with some subset of
\(\phi_t\). This can be done by testing whether or not \(\theta = 0\) in the regression
equation:

\[s_{t+1} - g_t = X_t \theta + \epsilon_{t+1}, \tag{2}\]

where:2

\(s_{t+1} = \ln S_{t+1}\);
\(g_t = \ln G_t\);
\(E(\epsilon_{t+1} \mid \phi_t) = 0\);
\(X_t \in \phi_t\).

Two examples of applications of (2) are Bilson (1981) and Hansen
and Hodrick (1980). In the former study \(X_t\) is the “forward premium”
at time \(t\) (\(g_t - s_t\)) and a constant, while in the latter study \(X_t\) contains
lagged values of the forecast errors. In both studies, the hypothesis
that \(\theta = 0\) is strongly rejected.
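A regression of the form (2) can be sketched with ordinary least squares. This is an illustration only: it assumes NumPy is available, uses a hypothetical helper name, and reports conventional OLS t-statistics, which ignore the serial correlation that arises when forecast horizons overlap. It does not reproduce the estimators used in the studies cited.

```python
import numpy as np


def ols_with_t_stats(y, X):
    """Regress y on a constant and the columns of X.

    y : forecast errors s_{t+1} - g_t
    X : information variables known at time t (one column per variable)
    Returns the coefficient vector (constant first) and conventional t-statistics.
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, coef / np.sqrt(np.diag(cov))
```

Under the unbiased expectations hypothesis, every coefficient should be statistically indistinguishable from zero.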

By using the simple interest rate parity (IRP) arbitrage condition
(defined in [3] below) we can derive an expression for \(E(s_{t+1} - g_t \mid \phi_t)\)
and show the conditions under which this expectation is zero. The
following notation will be used:

\(P_t\) = price, at time \(t\), of a default-free discount bond paying one
dollar at time \(t + 1\);

\(P_t^*\) = price (in units of foreign currency) of a default-free discount
bond paying one unit of foreign currency at time \(t + 1\);

\(R_{t+1} = \ln(1/P_t) - 1\) = the continuously compounded yield (from
\(t\) to \(t + 1\)) on a 1-period discount bond denominated in
dollars;

\(R_{t+1}^* = \ln(1/P_t^*) - 1\) = the continuously compounded yield on a
1-period discount bond denominated in the foreign currency;

\(\pi_t\) = the domestic price level at time \(t\) (\(\pi_t^*\) = foreign price level);

\(I_{t+1}\) = domestic inflation from \(t\) to \(t + 1\) = \(\ln(\pi_{t+1}/\pi_t)\) (\(I_{t+1}^*\) =
foreign inflation).

In order to avoid arbitrage profits the following relation must hold:

\[G_t = S_t \cdot \frac{P_t^*}{P_t}. \tag{3}\]

This is the IRP theorem.3 Empirical evidence seems to support IRP
(see McCormick 1979). By manipulating (3) one can show that the
unbiased expectations hypothesis of forward prices is intimately
linked to expected differences, across countries, in real rates of return
on nominal bonds and expected deviations from purchasing power
parity (PPP).4

2 In general, lower case letters are used to represent logarithms of the corresponding
upper case variables. An exception is the case of interest rates where lower case denotes
real instead of nominal rates.
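The interest rate parity condition in (3) can be checked numerically. The sketch below uses hypothetical function names and illustrative numbers; it computes the forward price implied by (3) and confirms that the two ways of securing a one-dollar payoff then cost the same.

```python
def forward_price(spot, p_domestic, p_foreign):
    """Forward price implied by (3): G_t = S_t * P*_t / P_t."""
    return spot * p_foreign / p_domestic


def dollar_cost_via_forward(spot, forward, p_foreign):
    """Dollar cost today of securing one dollar at t + 1 through the forward route.

    Buy P*_t / G_t units of foreign bonds now (cost S_t * P*_t / G_t dollars);
    they pay 1 / G_t units of foreign currency, which the forward contract
    converts into one dollar.
    """
    return spot * p_foreign / forward


# Illustrative values, not data from the paper.
spot, p_dom, p_for = 1.5, 0.96, 0.93
g = forward_price(spot, p_dom, p_for)
cost = dollar_cost_via_forward(spot, g, p_for)
```

At the parity forward price, `cost` equals the domestic bond price, so neither route is cheaper.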

We can express the exchange rate as

\[S_t = k \frac{\pi_t}{\pi_t^*} \cdot D_t, \tag{4}\]

where \(D_t\) = the deviation from PPP at time \(t\) (for \(D_t\) different from
unity), and \(k\) = a constant of proportionality. By combining (3) and
(4) we get an expression for the ex post difference between \(\ln(S_{t+1})\)
and \(\ln(G_t)\):

\[s_{t+1} - g_t = I_{t+1} - I_{t+1}^* + d_{t+1} - d_t - (R_{t+1} - R_{t+1}^*). \tag{5}\]

From the Fisher equation we can express the nominal yields as the
sum of expected real returns and expected inflation,

\[\begin{aligned} R_{t+1} &= E(\tilde{r}_{t+1}) + E(I_{t+1}), \\ R_{t+1}^* &= E(\tilde{r}_{t+1}^*) + E(I_{t+1}^*), \end{aligned} \tag{6}\]

where \(\tilde{r}_{t+1}\) = the real return on a default-free nominal bond maturing
at \(t + 1\). From (5) and (6) we have

\[s_{t+1} = E(\tilde{r}_{t+1}^* - \tilde{r}_{t+1}) + [I_{t+1} - E(I_{t+1})] - [I_{t+1}^* - E(I_{t+1}^*)] + d_{t+1} - d_t + g_t. \tag{7}\]

If the market makes rational forecasts of \(I_{t+1}\) and \(I_{t+1}^*\), the expectation
of \(s_{t+1}\), conditional on \(\phi_t\), is given by

\[E(s_{t+1}) = E(\tilde{r}_{t+1}^* - \tilde{r}_{t+1}) + E(d_{t+1} - d_t) + g_t. \tag{8}\]
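The step from (3) and (4) to the ex post decomposition of the forecast error can be verified numerically. The sketch below is an illustration with arbitrary inputs and a hypothetical function name. It takes the yields as ln(1/P); any constant common to both yields cancels in the differential, so it does not affect the check.

```python
from math import exp, isclose, log


def forecast_error_two_ways(k, pi, pi_star, d, p_dom, p_for):
    """Compute s_{t+1} - g_t directly and from the decomposition.

    pi, pi_star, d : (value at t, value at t + 1) pairs for the domestic price
                     level, foreign price level, and log PPP deviation
    p_dom, p_for   : time-t prices of the domestic and foreign discount bonds
    """
    # Spot rates from the PPP relation, with D = exp(d).
    s_now = k * pi[0] / pi_star[0] * exp(d[0])
    s_next = k * pi[1] / pi_star[1] * exp(d[1])
    # Forward price from interest rate parity.
    g_now = s_now * p_for / p_dom
    direct = log(s_next) - log(g_now)

    inflation = log(pi[1] / pi[0])
    inflation_star = log(pi_star[1] / pi_star[0])
    yield_dom = log(1 / p_dom)
    yield_for = log(1 / p_for)
    decomposed = (
        inflation - inflation_star + (d[1] - d[0]) - (yield_dom - yield_for)
    )
    return direct, decomposed


direct, decomposed = forecast_error_two_ways(
    k=1.3, pi=(100.0, 103.0), pi_star=(200.0, 212.0),
    d=(0.02, -0.01), p_dom=0.95, p_for=0.92,
)
```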


3 The IRP theorem requires that two investments that have the same future payoffs
must have the same cost. The cost of obtaining a $1.00 future payoff through investing
in default-free domestic bonds is \(P_t\). We can also obtain a $1.00 future payoff by
entering into a forward contract for $1.00 (in exchange for \(1/G_t\) units of foreign currency
at time \(t + 1\)). We can guarantee having \(1/G_t\) units of foreign currency next
period by investing \(P_t^*/G_t\) units in default-free foreign bonds now, at a dollar cost of
\(S_t P_t^*/G_t\). In order to avoid arbitrage the two costs must be the same (i.e., \(P_t = S_t P_t^*/G_t\)).
This gives (3).

4 If commodity markets are internationally open, arbitrage should guarantee that the
law of one price holds (within the bounds of transportation costs and time). That is, the
foreign price of a good, when translated at the current exchange rate, should equal the
domestic price of the good. If this logic is carried over to bundles of goods, we have a
relation between the prices of those bundles (i.e., the CPI): \(\pi_t^* S_t = k \pi_t\). Here the
constant \(k\) is needed since the base periods for the price indexes may differ. Thus,
values of \(D_t\), in (4), different from unity represent deviations from PPP. Similarly,
values of \(d_t = \ln D_t\) different from zero represent deviations from PPP.
This implies that \(g_t\) is not an unbiased forecast of \(s_{t+1}\) unless (i) expected
real returns on nominal bonds are equal across currencies and
(ii) \(E(d_{t+1}) = d_t\).5 Recent evidence indicates that condition (i) does not
seem to hold (see, e.g., Cumby and Obstfeld 1982; Mark 1982;
Mishkin 1982a, 1982b).

There are two major reasons why expected real interest rates may 
fail to be equal across currencies. The policy implications of these 
reasons are quite different. Note that \(\tilde{r}_{t+1}\) and \(\tilde{r}_{t+1}^*\) are the real returns
on nominal bonds denominated in dollars and foreign currency, respectively.
They are not riskless in real terms. In a world with risk-averse
investors, differences in risk will lead to differences in expected
returns (where the appropriate measure of risk depends on
the particular model of equilibrium).6 In this case activist policies are
not warranted since risk-adjusted expected real returns are equal 
across currencies. 

A differential in expected real returns across countries may also be 
due to market segmentation or barriers to capital movements across 
currencies. In this case the monetary authorities may have the ability 
to influence real savings/investment decisions by influencing the ex 
ante real rate of interest. The market segmentation or barriers, for 
whatever reason they exist, prevent complete arbitrage across international
markets. This does not seem to be relevant for the interest
rates used here since Eurobonds are offshore deposits not subject to 
capital controls. 

Condition (ii) is equivalent to Roll's (1979) “efficient market version
of PPP” (his eq. [6.7]) and is less restrictive than requiring exact PPP
(i.e., \(d_t = 0\), for all \(t\)). While exact PPP implies that the ex post real
return on an asset is independent of the country of residence, condition
(ii) implies only that the ex ante real rate is independent of
residence. Investigation of the univariate time-series properties of the
\(d_t\) series seems to indicate that monthly percentage changes in deviations
from PPP are serially uncorrelated. Also, the evidence in Roll
(1979) and Adler and Lehmann (1983) is supportive of this ex ante
version of PPP. There is some debate about whether the martingale
property of the \(d_t\) series is reasonable on a theoretical basis. The
existence of international commodity arbitrage would lead one to
expect that there should be mean reversion in the \(d_t\) series. However,
arbitrage in the international financial markets can lead to the martingale
property (see, e.g., Adler and Lehmann 1983). This is discussed
in more detail in Section III.

5 Unbiased expectations would also hold if \(E(\tilde{r}_{t+1}^* - \tilde{r}_{t+1}) = -E(d_{t+1} - d_t)\), for all \(t\).
This is likely to happen only by chance.

6 In general, risk is related to the covariability between marginal utility of consumption
and the real return on the asset (or equivalently the covariability between the
marginal utility of currency and the nominal return of the asset).
Failure of UEH may also be due to market inefficiency (irrationality).
That is, the market’s expectations may not equal the true conditional
expectation. In this case there will be an ex post “bias” due to an
inefficient market. The tests outlined in Section III below will help
determine the relative importance of expected real return differentials
and deviations from PPP as explanations of the observed bias in
forward prices.7

If one assumes that $E(d_{t+1} \mid \phi_t) = d_t$, equation (8) implies
that forecastable deviations from unbiased expectations should be equal
to the forecastable difference in real returns across nominal bonds
denominated in different currencies. That is, if $X_t$ is a subset of
$\phi_t$, then the following equality should hold:

$E(s_{t+1} - g_t \mid X_t) = E(\tilde r^*_{t+1} - \tilde r_{t+1} \mid X_t)$. (9)

Assuming that the expected real return differential is observable and
that $Z_t$ is a subset of $X_t$, then the following regression is estimable:

$s_{t+1} - g_t = \theta_0 + \theta_1 E(\tilde r^*_{t+1} - \tilde r_{t+1} \mid X_t) + \theta_2 Z_t + \eta_{t+1}$. (10)

The theory implies that $\theta_0$ and $\theta_2$ should equal zero and
$\theta_1$ should equal unity. Rejection of these restrictions would
indicate that risk premia are not the only reason for observed deviations
from unbiased expectations. Of course any test is a joint test of the
null hypothesis and any additional assumptions used to make the
hypothesis testable. For example, the formulation leading to the
restrictions makes use of the assumption that
$E(d_{t+1} - d_t \mid \phi_t) = 0$. Hence, rejection of the restrictions
is a rejection of the joint hypothesis that (i) risk premia are the cause
of deviations from UEH and (ii) $E(d_{t+1} - d_t \mid \phi_t) = 0$.

Even though the expected real return differential is not observable,
consistent estimates of the parameter vector, $\theta$, may be obtained
by using instrumental variables in the estimation procedure. Three-stage
least squares (3SLS) is used here. The estimators are discussed in
more detail below.

The formulation of the testable restrictions in (10) is quite general
in that it is consistent with a number of different asset-pricing
scenarios. It can accommodate time variation in premia due to
changes in risk as well as movement in the price of risk. This can be
contrasted with the latent variable model of Hansen and Hodrick
(1983) and Hodrick and Srivastava (1984), which allows time variation
in risk premia due to time variation in the expected excess return on a
benchmark portfolio (the price of risk), but it does not allow time
variation in premia due to changes in conditional covariances (risks).
Unfortunately, this “advantage” of a more general formulation comes
at the cost of lowering the power of the tests against specific
alternative hypotheses. Since many of the more highly parameterized
alternative hypotheses have been rejected, this simpler approach may be
useful.

7 The same type of expression as (8) can be obtained for futures prices
with some slight modifications (see Cox, Ingersoll, and Ross 1981).


III. Empirical Results 

A. Data and Summary Statistics 

Monthly data for the 1-month forward prices and the spot prices of
eight currencies are from the Weekly Review of the Harris Trust and
Savings Bank. Data from March 1974 through September 1979 are
from the Harris Bank Data Base, supported by the Center for Studies
in International Finance, University of Chicago. Data for October
1979 through July 1981 were hand collected from the Weekly Review.
The eight currencies in the sample are the British pound, Canadian
dollar, French franc, Swiss franc, lira, deutsche mark, guilder, and
Belgian franc. Prices are those quoted on the last Friday of the month.
Corresponding Eurocurrency rate data are also obtained from the
Harris Bank Data Base. Although T-bill rates are available, the
Eurocurrency rates are used here because they are relatively free
from capital controls. Price level data for each country are taken from
International Financial Statistics. The consumer price index (CPI) is
used to construct inflation series. 8

Sample autocorrelations of spot and 1-month forward prices indicate
that the series are nonstationary in the mean. This is consistent
with the results of Meese and Singleton (1982). The sample
autocorrelations of the “forecast errors” ($s_{t+1} - g_t$) indicate that
mean nonstationarity is not a problem. Table 1 contains summary
statistics for (1) the 1-month forecast errors ($s_{t+1} - g_t$); (2) the
difference between the foreign and domestic real returns on Eurobonds;
and (3) the difference between (1) and (2). The means of the individual
series are generally small and insignificantly different from zero. Also,
the variability in the “forecast error,” $s_{t+1} - g_t$, is substantially
larger than the variability in the ex post real interest rate differential.

Two tests of normality are reported in table 1. The first test statistic
is the studentized range, which is relatively powerful against
“fat-tailed” distributions often found in financial data. Using this
statistic, one can reject (at the 5 percent level) normality for four of
the forecast error series, none of the real rate series, and four of the
series formed by taking the difference between the forecast error and the
real rate differential. Of course, these statistics are not independent.
The second test statistic is the Kolmogorov-Smirnov (K-S) statistic,
which rejects normality (at the 5 percent level) in three of the forecast
error series, three of the real rate series, and four of the forecast
error/real rate difference series. Thus, there seems to be some deviation
from normality, possibly due to the fat-tailed distributions. The
effect of nonnormality on hypothesis tests is discussed in more detail
below.

8 Because the CPI is constructed from goods prices sampled throughout
the month, it provides a closer measure of inflation from midmonth to
midmonth rather than end-of-month inflation. Use of end-of-month forward
and spot prices causes some mismatching of time periods. A number of the
tests were also performed using midmonth exchange rate data without any
important changes in the results.

TABLE 2

Hotelling $T^2$ Test: Testing Whether All Means Equal Zero

Series                                                        $T^2$
$s_{t+1} - g_t$                                               27.68**
$\tilde r^*_{t+1} - \tilde r_{t+1}$                           80.94**
$(s_{t+1} - g_t) - (\tilde r^*_{t+1} - \tilde r_{t+1})$       14.02

*Significant at the .10 level.
**Significant at the .05 level.

As a first pass it is instructive to determine if there is any average
deviation from unbiased expectations. A Hotelling $T^2$-statistic is used
to test jointly whether the means of the series in table 1 are equal to
zero for all countries (see table 2). Under UEH the means of the
forecast errors, the real rate differentials, and the difference between
the two should be zero across all countries. The alternative model of
Section II does not require that the mean forecast error or the real
rate differential be zero, but the mean difference between them
should be zero across all countries. The $T^2$-tests in table 2 reject (at
the 5 percent level) the hypothesis that the means of the forecast errors
are zero for all countries. One can also reject the hypothesis that the
mean real interest rate differential is zero for every country. However,
one cannot reject the hypothesis that the mean difference between
the forecast error and the real interest rate differential is zero
for every country. Thus the $T^2$-statistics are inconsistent with UEH
but are consistent with the model in Section II in which average real
interest rate differentials must equal the average deviations from
UEH.
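The joint test of zero means described above can be sketched in a few lines. This is a minimal illustration of the one-sample Hotelling $T^2$ statistic, not the paper's own computation: the function name `hotelling_t2` and the simulated input are hypothetical, and the F conversion assumes multivariate normal observations that are independent over time.

```python
import numpy as np

def hotelling_t2(x):
    """One-sample Hotelling T^2 statistic for H0: every column mean is zero.

    x is an (n, p) array: n monthly observations on p countries.
    Returns (T2, F, df1, df2). Under H0 and multivariate normality,
    F = (n - p) / (p * (n - 1)) * T2 follows an F(p, n - p) distribution.
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    mean = x.mean(axis=0)
    # Unbiased sample covariance; atleast_2d keeps the p = 1 case a matrix.
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    t2 = float(n * mean @ np.linalg.solve(cov, mean))
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, p, n - p

# Hypothetical data: 82 months, 8 countries, true means equal to zero.
rng = np.random.default_rng(0)
demo = 0.03 * rng.standard_normal((82, 8))
t2, f_stat, df1, df2 = hotelling_t2(demo)
print(f"T2 = {t2:.2f}, F({df1}, {df2}) = {f_stat:.2f}")
```

With a single series the statistic reduces to the square of the usual one-sample t-ratio, which is a convenient sanity check.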

B. Small-Sample Properties of Test Statistics

As noted earlier, there may be reason for concern regarding the
distribution of test statistics reported below. Even if the regression
errors are normally distributed, we know only the asymptotic
distribution of the test statistics in the seemingly unrelated regression
(SUR) and 3SLS models when the covariance matrix is estimated
($\hat\Sigma$) rather than known except for some special cases (see, e.g.,
Zellner 1963; Srivastava 1970; Mariano 1982). Given that data are
available for fewer than 100 months it is possible that the small-sample
distributions of the test statistics, reported below, are different from
their asymptotic distributions. This problem is further complicated by
the possible deviations from normality that are evident in table 1. The
asymptotic distribution will not be affected as long as the “fat-tailed”
distributions have finite variances (e.g., Student-t rather than
Pareto-Levy distributions), but convergence to the asymptotic
distribution is likely to be less rapid. 9 Thus, the leptokurtic
properties of the error distributions are likely to exacerbate the
small-sample problem.

Bootstrap (see Efron 1982) and Monte Carlo simulations are used 
here to investigate the small-sample properties of the test statistics 
and to separate the effects of nonnormality and small sample size. 
The results indicate that the small-sample distributions of the test 
statistics are significantly different from the asymptotic distribution 
and that, at least in the cases studied here, there is a pronounced bias 
against the null hypothesis. 

The bootstrap is a nonparametric procedure in which the errors in
a regression equation are assumed to be drawn from some
unspecified distribution with distribution function $F$. In the
multivariate regression case it is assumed that for each period a vector
of errors is drawn from the joint distribution function $F$. We can get an
estimate of this distribution from the residuals of the original
regression, the “empirical” error distribution. The empirical error
distribution, $\hat F$, is constructed by placing mass $1/N$ on
$\hat e_t$, where $\hat e_t$ is the vector of regression residuals at time
period $t$, $t = 1, \ldots, N$. From this empirical error distribution one
can estimate the small-sample distribution of the test statistics, under
the null hypothesis, by (i) sampling from the empirical error
distribution with replacement, (ii) constructing regression equations
such that the null hypothesis is true, and (iii) reestimating the
regression and calculating the desired statistics. This is done
repeatedly, and we obtain a distribution for each test statistic.
From this distribution, percentiles of the test statistic can be
calculated. The theory underlying the construction of confidence regions
is not well developed except in special cases (see Efron 1982,
chap. 10), but it has yielded useful results in less complicated
problems than the ones presented here.

9 Islam (1981) presents evidence that suggests that the Student-t
distribution is more reasonable than the Paretian for exchange rate data.
The asymptotic distributions of a variety of estimators without
normality are derived in Burguete, Gallant, and Souza (1982).
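Steps (i) through (iii) of the bootstrap can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's 3SLS procedure: each equation is estimated by OLS, the statistic is a plain sum of squared t-ratios for the hypothesis that every slope equals one, and the data and all names (`slope_test_statistic`, `bootstrap_null_distribution`) are hypothetical. Whole rows of the residual matrix are resampled so that the cross-equation correlation of the errors is preserved.

```python
import numpy as np

def slope_test_statistic(y, x):
    """Equation-by-equation OLS of y[:, j] on a constant and x[:, j].

    Returns the sum over equations of squared t-ratios for H0: slope = 1
    (a simplified stand-in for a joint system statistic) together with
    the (n, m) matrix of residuals.
    """
    n, m = y.shape
    stat = 0.0
    resid = np.empty((n, m))
    for j in range(m):
        X = np.column_stack([np.ones(n), x[:, j]])
        beta = np.linalg.lstsq(X, y[:, j], rcond=None)[0]
        resid[:, j] = y[:, j] - X @ beta
        s2 = resid[:, j] @ resid[:, j] / (n - 2)
        var_slope = s2 * np.linalg.inv(X.T @ X)[1, 1]
        stat += (beta[1] - 1.0) ** 2 / var_slope
    return stat, resid

def bootstrap_null_distribution(y, x, n_boot=300, seed=0):
    """Bootstrap distribution of the statistic with the null imposed."""
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    _, resid = slope_test_statistic(y, x)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)   # (i) resample residual vectors
        y_star = x + resid[rows]            # (ii) impose intercept 0, slope 1
        stats[b] = slope_test_statistic(y_star, x)[0]  # (iii) re-estimate
    return stats

# Hypothetical data: 82 months, 8 equations, fat-tailed errors, null true.
rng = np.random.default_rng(123)
x = rng.standard_normal((82, 8))
y = x + rng.standard_t(df=4, size=(82, 8))
observed, _ = slope_test_statistic(y, x)
null_stats = bootstrap_null_distribution(y, x)
print("observed statistic:", round(float(observed), 2))
print("bootstrap 95th percentile:", round(float(np.percentile(null_stats, 95)), 2))
print("bootstrap p-value:", float(np.mean(null_stats >= observed)))
```

The percentiles of `null_stats` play the role of critical values in place of the asymptotic chi-square percentiles.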

There are several important questions about the bootstrap procedure
that should be mentioned. Much of the uncertainty about how
good the bootstrap estimates are revolves around the use of $\hat F$ to
estimate the true distribution, $F$. There are a number of cases where
one might suspect $\hat F$ to be far from $F$, especially in the
regression case with fat-tailed distributions. In a least-squares
regression, outliers can significantly influence the regression
coefficients and, hence, the estimates of the regression errors (i.e.,
the empirical error distribution used in the bootstrap). The nonnormal
(fat-tailed) distributions often found in financial data are more likely
to exhibit such outliers. An extreme example of this is the so-called
peso problem, in which there is a small chance of a large (in absolute
value) error term (see Krasker 1980). If the sample size is large
relative to the time between rare events, then $\hat F$ should be close
to $F$. However, if the sample size is small, then $\hat F$ may be a poor
approximation of $F$. Freedman and Peters (1982) find that the bootstrap
estimates of the standard errors of regression coefficients, in an SUR
model, tend to be understated (although they are closer to the true
standard errors than are the standard errors from the asymptotic
distribution).

Other instances when $\hat F$ may be far from $F$ are when a substantial
amount of overfitting or pretesting is done. As noted in Efron (1982,
p. 36), the bootstrap procedure “depends on $\hat F$ being a reasonable
estimate of $F$, and can give falsely optimistic results if we are fitting
highly overparameterized models in hopes of finding a good one.”
For example, if one has 20 observations on $(y_t, X_t)$ and regresses
$y_t$ on a constant and a nineteenth-degree polynomial in $X_t$, $\hat F$
will not have any resemblance to $F$. Although this is an extreme case,
overfitting will in general understate the true variability of regression
errors.
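The nineteenth-degree polynomial example can be reproduced numerically. The sketch below uses simulated data (an assumption, since no such data appear in the text) and fits the polynomial in a Chebyshev basis purely for numerical stability; the fitted function is still a nineteenth-degree polynomial in $X_t$. With 20 observations and 20 coefficients the fit interpolates the data, so the residuals that would define $\hat F$ collapse toward zero even though the true errors have unit variance.

```python
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(1)
n = 20
x = np.linspace(-1.0, 1.0, n)
y = 0.5 * x + rng.standard_normal(n)   # true errors have unit variance

resid_std = {}
for degree in (1, 19):
    # Least-squares polynomial fit of the stated degree (constant included).
    fit = Chebyshev.fit(x, y, deg=degree)
    resid = y - fit(x)
    resid_std[degree] = float(resid.std())
    print(f"degree {degree:2d}: residual std = {resid_std[degree]:.3e}")
```

A bootstrap that resampled the degree-19 residuals would be drawing from a distribution concentrated at zero and would report almost no sampling variability.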

In addition to the bootstrap, Monte Carlo simulations are used to
investigate the small-sample properties of the test statistics for each
test reported below, assuming the errors in the regressions are normally
distributed. That is, rather than use the original regression
residuals to estimate $F$, a random number generator is used to construct
multivariate normal errors. 10 The difference between the
Monte Carlo distribution and the asymptotic distribution of the test
statistics will be due to the small sample size (within sampling error),
while the difference between the bootstrap distribution and the
Monte Carlo distribution of the statistics will be due to the difference
between the empirical error distribution and the normal errors used
in the Monte Carlo simulations.

10 Multivariate normal random vectors are created with the GGNSM
subroutine of the International Mathematical and Statistical Library. The
covariance matrix is equal to the estimated covariance matrix from the
initial regressions.
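The only difference between the two simulation designs is the source of the error vectors, which the following sketch makes explicit. It is an illustration with hypothetical names and simulated residuals, and it uses NumPy's multivariate normal generator as a stand-in for the IMSL routine named in the footnote.

```python
import numpy as np

def normal_errors(resid, n_draws, rng):
    """Monte Carlo errors: N(0, S), with S the residual covariance matrix."""
    cov = np.cov(resid, rowvar=False)
    return rng.multivariate_normal(np.zeros(resid.shape[1]), cov, size=n_draws)

def bootstrap_errors(resid, n_draws, rng):
    """Bootstrap errors: whole residual vectors drawn with replacement."""
    rows = rng.integers(0, resid.shape[0], size=n_draws)
    return resid[rows]

def excess_kurtosis(a):
    """Column-wise sample excess kurtosis (zero for a normal distribution)."""
    z = (a - a.mean(axis=0)) / a.std(axis=0)
    return (z ** 4).mean(axis=0) - 3.0

# Hypothetical fat-tailed residuals: 82 periods, 3 equations.
rng = np.random.default_rng(7)
resid = rng.standard_t(df=3, size=(82, 3))
resid -= resid.mean(axis=0)

mc = normal_errors(resid, 5000, rng)
bs = bootstrap_errors(resid, 5000, rng)
print("excess kurtosis, Monte Carlo draws:", np.round(excess_kurtosis(mc), 2))
print("excess kurtosis, bootstrap draws:  ", np.round(excess_kurtosis(bs), 2))
```

Both sets of draws share the same covariance matrix by construction, so any gap between the resulting distributions of a test statistic reflects the shape of the error distribution rather than its scale.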

C. Interest Rate Differentials and Forward Prices: Test Results

Deviations from UEH due to serial correlation in the forecast error 
series have been documented convincingly in Hansen and Hodrick 
(1980, 1983). 11 The results reported below show that these results are 
primarily due to correlation between past forecast errors and shifts in 
the expected real return differential across bonds denominated in 
different currencies. That is, the observed deviations from UEH are 
explained by the systematic relation, of the type predicted in Section 
II, between the forecast error and the regression of the real interest 
rate differential on past information. 

In Section II it was shown that, under rational expectations, the
difference between the log of the spot price at time $t + 1$ and the log
of the forward price at $t$ is given by

$s_{t+1} - g_t = E(\tilde r^*_{t+1} - \tilde r_{t+1}) + \eta_{t+1}$,

where $E(\eta_{t+1} \mid \phi_t) = 0$, assuming
$E(d_{t+1} - d_t \mid \phi_t) = 0$. If one had observations on
$E(\tilde r^*_{t+1} - \tilde r_{t+1})$ it would be possible to estimate
the model

$s_{t+1} - g_t = \delta_0 + \delta_1 E(\tilde r^*_{t+1} - \tilde r_{t+1}) + \eta_{t+1}$. (11)

Market efficiency would require that $\delta_0 = 0$, $\delta_1 = 1$, and
$E(\eta_{t+1} \mid \phi_t) = 0$. Unfortunately, the expected differential
in real returns on nominal bonds is not observable. One method of
proceeding is to use the realized value of the random variable as a proxy
for the expected value. With rational expectations, the realized value is
equal to the conditional expectation plus an error term that is
uncorrelated with past information, $\phi_t$:

$\tilde r^*_{t+1} - \tilde r_{t+1} = E(\tilde r^*_{t+1} - \tilde r_{t+1}) + e_{t+1}$,
$E(e_{t+1} \mid \phi_t) = 0$. (12)

Combining (11) and (12), we have:

$s_{t+1} - g_t = \delta_0 + \delta_1(\tilde r^*_{t+1} - \tilde r_{t+1}) + \epsilon_{t+1}$,
$\epsilon_{t+1} = \eta_{t+1} - \delta_1 e_{t+1}$. (13)


11 Hansen and Hodrick use overlapping data with OLS estimators. (The
standard errors are corrected for the moving average error structure.)
Korajczyk (1983) documents the serial correlation using nonoverlapping
data with multivariate regression estimators.
This formulation results in error terms that are correlated with the
independent variables; that is,

$\mathrm{cov}(\epsilon_{t+1}, \tilde r^*_{t+1} - \tilde r_{t+1}) = -\delta_1 \mathrm{var}(e_{t+1}) + \mathrm{cov}(e_{t+1}, \eta_{t+1})$.

The correlation between $\epsilon_{t+1}$ and
$\tilde r^*_{t+1} - \tilde r_{t+1}$ implies that OLS estimates of
$\delta$ are inconsistent.
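A small simulation illustrates why the realized differential cannot simply be used as a regressor. The sketch below is a stylized single-equation example under assumed parameter values, using two-stage least squares with one instrument rather than the paper's 3SLS system; all variable names and numbers are hypothetical. The expected differential is driven by an instrument $z_t$ known at time $t$, the realized differential adds the expectation error $e_{t+1}$, and the true coefficients are $\delta_0 = 0$ and $\delta_1 = 1$.

```python
import numpy as np

def ols(x, y):
    """OLS of y on a constant and x; returns [intercept, slope]."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(42)
n = 200_000                       # large n, to show the probability limits
z = rng.standard_normal(n)        # instrument in the time-t information set
expected = 0.5 * z                # expected real differential (unobservable)
e = rng.standard_normal(n)        # expectation error e_{t+1}, eq. (12)
eta = rng.standard_normal(n)      # eta_{t+1}, eq. (11)
realized = expected + e           # realized real differential
y = expected + eta                # forecast error with delta0 = 0, delta1 = 1

b_ols = ols(realized, y)          # regressor is correlated with the error

first_stage = ols(z, realized)    # project the realized differential on z
fitted = first_stage[0] + first_stage[1] * z
b_iv = ols(fitted, y)             # second stage uses the fitted values

print("OLS slope:", round(float(b_ols[1]), 3))
print("IV slope: ", round(float(b_iv[1]), 3))
```

In this design the OLS slope converges to $\mathrm{var}(E)/[\mathrm{var}(E) + \mathrm{var}(e)] = 0.25/1.25 = 0.2$ rather than unity, the attenuation implied by the $-\delta_1 \mathrm{var}(e_{t+1})$ term, while the instrumental variables slope converges to one.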

Three-stage least squares (3SLS) is used to obtain consistent estimates
of $\delta$ in (13). The instruments are chosen on the basis of their
documented correlation with future real interest rates. The instruments
used in the first stage are (1) a constant; (2) the average real
interest rate differential ($r^*_t - r_t$) over the preceding 12 months;
(3) the difference in the inflation rate between the United States and
the foreign country in the previous month; (4) the difference (across
countries) of the sample standard deviation in nominal interest rates
over the preceding 26 weeks; (5) the lagged value of the dependent
variable; and (6) the “forward premium” ($g_t - s_t$) at the beginning of
the month. Variable 2 was chosen since it was shown to be a reasonable
model for U.S. real interest rates (as an approximation to an
ARIMA [0, 1, 1] model) in Fama and Gibbons (1982). An ARIMA
model of the real interest differential series indicates that the IMA (1,
1) specification was reasonable for some but not all of the series. Even
though a single specification did not fit all of the series, only one was
used in order to reduce the chances of overfitting the data. Variables
3 and 6 were included since they have been shown to be useful in
predicting interest differentials (see Mark 1982; Mishkin 1982a,
1982b). Variable 4 was included since nominal interest rate variability
seems to explain risk premia in the U.S. Treasury bill market (Fama
1976b). 12

The first-stage regressions explain between 5 and 35 percent of the
real interest rate differential (unadjusted $R^2$). The average $R^2$
across the eight cross sections is 18 percent. This explanatory power is
somewhat less than the results in Mark (1982), which have unadjusted
$R^2$ values ranging from 33 to 64 percent. However, Mark’s regressions
generally include many more independent variables (e.g., his regression
with $R^2 = .64$ has 33 RHS variables). The number of instruments
used in my 3SLS procedure is kept small in order to avoid using too
many degrees of freedom through overfitting the data.

The 3SLS estimates of (13) are given in tables 3 and 4. In table 3 the
estimates are unconstrained while in table 4 the estimates of $\delta_1$
are constrained to be equal across equations. In table 3 the data for only
one of the countries, Italy, give strong evidence against the hypothesis
that $\delta_0 = 0$ and $\delta_1 = 1$. The test statistics ($\lambda$) for
joint hypotheses across equations are given in part B of the table. 13
There are four tests in table 3: (1) the intercepts are zero for all
equations; (2) the slope coefficients are the same across equations (this
is the constraint imposed in table 4); (3) the slope coefficients are all
equal to unity; and (4) the intercepts are zero and the slopes are unity
for all cross sections.

12 Implicit in the 3SLS framework is the assumption that the relation
between the instruments and the real return differential is stationary
(i.e., the regression parameters are fixed). There is nothing in the
formulation that assures this property since the first-stage regressions
are not derived from a fundamental utility-maximizing model with fixed
underlying parameters. Because of this we cannot rule out (theoretically)
the possibility of time-varying coefficients due to regime changes as
discussed in Lucas (1976).

13 The test statistic, $\lambda$, for the linear restriction $r = R\theta$
(where $\theta$ is the parameter vector) is given by:

$\lambda = K \, \dfrac{(r - R\hat\theta)'[R \, \mathrm{cov}(\hat\theta) R']^{-1}(r - R\hat\theta)}{\hat\epsilon' \, \mathrm{cov}(\hat\epsilon)^{-1} \hat\epsilon}$,

where $\mathrm{cov}(\hat\theta)$ = the estimate of the covariance matrix of
$\hat\theta$; $\hat\epsilon$ = the vector of regression residuals;
$\mathrm{cov}(\hat\epsilon)$ = the estimate of the covariance matrix of
$\epsilon$; and $K$ = the number of degrees of freedom. The asymptotic
distribution of $\lambda$ is $\chi^2(q)$, where $q$ is the number of
restrictions being tested (see Theil 1971, pp. 402, 508-13).

TABLE 3

Unrestricted Instrumental Variables Estimates
(April 1974-December 1980)

$s_{t+1} - g_t = \delta_0 + \delta_1(\tilde r^*_{t+1} - \tilde r_{t+1}) + \epsilon_{t+1}$

A. Estimates

                                            $\lambda(2)$ for
Country   $\delta_0$       $\delta_1$       $\delta_0 = 0$, $\delta_1 = 1$   D-W (a)
UK         .010 (.005)     3.10 (1.51)      5.38                             1.85
CA        -.002 (.001)     1.78 (1.02)      3.04                             1.94
BE         .003 (.004)      .64 (.48)       1.31                             1.99
FR         .001 (.004)     2.80 (1.14)      3.98                             2.37
GE        -.001 (.005)      .21 (1.77)       .40                             1.96
IT        -.001 (.003)     2.43** (.57)     9.41**                           2.28
NE         .000 (.004)      .74 (.47)        .48                             2.19
SW         .001 (.005)      .84 (1.56)       .15                             2.04

B. Test Statistics

                                                              P-Value      P-Value       P-Value
Test                                         $\lambda$   df   Asymptotic   Monte Carlo   Bootstrap
$\delta_0 = 0$: all equations                 15.06       8    .058         .084          .137
$\delta_1$: equal across equations            15.54       7    .030         .007          .090
$\delta_1 = 1$: all equations                 18.00       8    .021         .057          .083
$\delta_0 = 0$ and $\delta_1 = 1$:
  all equations                               28.63      16    .027         .097          .130

Note: Asymptotic standard errors in parentheses. 3SLS estimates, weighted $R^2$ = .07.
(a) D-W statistic from first-stage regressions.
*Significantly different from the null at the .05 level, using asymptotic standard errors.
**Significantly different from the null at the .01 level, using asymptotic standard errors.

TABLE 4

Restricted Instrumental Variables Estimates
(April 1974-December 1980)

$s_{t+1} - g_t = \delta_0 + \delta_1(\tilde r^*_{t+1} - \tilde r_{t+1}) + \epsilon_{t+1}$

Country   $\delta_0$       $\delta_1$
UK         .005 (.004)     1.37 (.29)
CA        -.002 (.001)     1.37 (.29)
BE         .001 (.004)     1.37 (.29)
FR         .002 (.004)     1.37 (.29)
GE        -.001 (.004)     1.37 (.29)
IT         .000 (.003)     1.37 (.29)
NE         .000 (.004)     1.37 (.29)
SW         .002 (.004)     1.37 (.29)

                                                                   P-Value
Test                                              $\lambda$        df   Asymptotic
$\delta_0 = 0$: all equations                     [illegible]       8    .21
$\delta_0 = 0$, $\delta_1 = 1$: all equations     [illegible]       9    .17

Note: Asymptotic standard errors in parentheses, weighted $R^2$ = .05.

Fig. 1. Testing whether all intercepts equal zero. Bootstrap distribution
versus null distribution of test statistics for linear restrictions. Solid
curve is the null distribution. Needles represent the bootstrap
distribution. Three hundred replications of 3SLS regressions for eight
equations with 82 observations per equation.

Fig. 2. Testing whether all slopes are equal (see fig. 1 n.).

If one uses the asymptotic distribution of the test statistics and 5
percent as the size of the test, one accepts the hypothesis that
$\delta_0 = 0$ for all equations but rejects the remaining three
hypotheses (see table 3). However, the Monte Carlo and bootstrap estimates
of the distribution of the test statistics are quite different from the
asymptotic distribution, especially for tests involving the slope
coefficients. This can be seen in figures 1-4, which compare the bootstrap
distributions with the asymptotic distributions for all four tests. A K-S
test rejects the hypotheses that the bootstrap distribution and the Monte
Carlo distribution are drawn from the asymptotic distribution even though
the bootstrap/Monte Carlo regressions are constructed such that the null
hypothesis is true. (A two-sample K-S test is used to test the hypothesis
that the bootstrap and Monte Carlo distributions are drawn
from the same distribution. This hypothesis is not rejected, perhaps
because the test is not very powerful.)
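The two-sample K-S comparison of simulated distributions can be sketched directly from its definition as the largest gap between two empirical distribution functions. The function names below are hypothetical, the simulated inputs are placeholders rather than the paper's replications, and the critical-value formula is the standard large-sample approximation with coefficient 1.358 at the 5 percent level.

```python
import numpy as np

def ks_two_sample(a, b):
    """Two-sample K-S statistic: sup over x of |F_a(x) - F_b(x)|."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    pooled = np.concatenate([a, b])
    # Empirical CDFs of each sample evaluated at every pooled point.
    cdf_a = np.searchsorted(a, pooled, side="right") / a.size
    cdf_b = np.searchsorted(b, pooled, side="right") / b.size
    return float(np.abs(cdf_a - cdf_b).max())

def ks_critical_value_5pct(n, m):
    """Large-sample 5 percent critical value for the two-sample statistic."""
    return 1.358 * np.sqrt((n + m) / (n * m))

# Placeholder draws standing in for two sets of simulated test statistics.
rng = np.random.default_rng(11)
boot_stats = rng.chisquare(8, size=300)
mc_stats = rng.chisquare(8, size=300)
d = ks_two_sample(boot_stats, mc_stats)
print("K-S statistic:", round(d, 3))
print("5% critical value:", round(float(ks_critical_value_5pct(300, 300)), 3))
```

If both simulations used 300 replications (an assumption for the Monte Carlo runs), the 5 percent critical value would be roughly 0.11.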

The simulation results indicate that we would reject the null hypotheses
(at the 5 percent level) approximately 10-15 percent of the
time when they are true. Thus we will reject the null hypothesis too
often when the asymptotic distribution is used. This is illustrated in
figures 1-4, where we can see that the asymptotic distribution implies
a much lower probability of large values of the test statistic than the
actual distribution. If we use the percentiles of the Monte Carlo or
bootstrap distributions, then not one of the joint hypotheses is
rejected at the 5 percent level.
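Both quantities in this paragraph, the actual rejection rate of a nominal 5 percent test and the p-value read from a simulated distribution, are simple tail frequencies. The sketch below uses hypothetical function names and an invented stand-in for the small-sample distribution (a $\chi^2(8)$ variate inflated by 25 percent), chosen only to illustrate the mechanics; it is not an estimate of the paper's distributions.

```python
import numpy as np

def empirical_p_value(observed, simulated):
    """Share of simulated null statistics at least as large as observed."""
    simulated = np.asarray(simulated, dtype=float)
    return float(np.mean(simulated >= observed))

def rejection_rate(simulated, critical_value):
    """Actual size of a test that rejects above critical_value."""
    simulated = np.asarray(simulated, dtype=float)
    return float(np.mean(simulated > critical_value))

# Invented small-sample null distribution with a fatter right tail.
rng = np.random.default_rng(5)
simulated_null = 1.25 * rng.chisquare(8, size=100_000)

asymptotic_cv = 15.507            # 5 percent critical value of chi-square(8)
size = rejection_rate(simulated_null, asymptotic_cv)
p_sim = empirical_p_value(15.06, simulated_null)
print("actual size of nominal 5% test:", round(size, 3))
print("simulated p-value for 15.06:", round(p_sim, 3))
```

Under this invented distribution the nominal 5 percent test rejects a true null a little over 13 percent of the time, which is the kind of size distortion the simulations in the text report.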

Since the hypothesis that $\delta_1$ is equal across equations is accepted
(using the estimates of the small-sample distribution of the test
statistic), this constraint is imposed in table 4. Imposition of the
constraint provides a more precise estimate of $\delta_1$ and more
powerful tests of the hypotheses (assuming it is true). The constrained
estimate of $\delta_1$ is 1.37 with an asymptotic standard error of 0.29.
The hypotheses that (1) $\delta_0 = 0$ for all equations and (2)
$\delta_0 = 0$ and $\delta_1 = 1$ for all equations are not rejected even
using the asymptotic distribution. Since the actual asymptotic standard
error of this pretest estimator is not the same as the asymptotic standard
error estimate assuming no pretesting (i.e., 0.29), the standard errors in
table 4 should be interpreted with caution.
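As a rough check on the constrained slope estimate, the single restriction that the common slope equals unity can be examined with a one-degree-of-freedom calculation. This is only a back-of-the-envelope sketch: it takes the asymptotic standard error of 0.29 at face value, which the pretesting caveat above warns against, and it uses the standard 5 percent critical value of 3.84 for a $\chi^2(1)$ variate.

```latex
\[
z = \frac{\hat\delta_1 - 1}{\mathrm{se}(\hat\delta_1)}
  = \frac{1.37 - 1}{0.29} \approx 1.28,
\qquad
z^2 \approx 1.63 < 3.84 .
\]
```

On this calculation the estimate lies within about 1.3 standard errors of unity.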

The results in tables 3 and 4 indicate that at least some of the
forecastable deviations from UEH can be explained by ex ante
differentials in real returns (on nominal bonds) across currencies. The
parameter estimates are consistent with the formulation in Section II
(i.e., $\delta_0 = 0$, $\delta_1 = 1$) and indicate an almost one-for-one
(1.37 with standard error 0.29) relation between the projection of the
interest rate differential on past information and the deviation between
the current forward price and the spot price next period. Obviously the
real interest rate differential does not explain much of the variation
(weighted $R^2$ of .07 or .05), but the explanatory power is statistically
significant (i.e., one can reject $\delta_0 = \delta_1 = 0$ using any of
the three distributions for the test statistics). 14

One can test whether the autocorrelation of the “forecast errors”
found by Hansen and Hodrick (1980, 1983) is accounted for by the ex
ante real interest rate differential by testing
$(\theta_0, \theta_1, \theta_2) = (0, 1, 0)$ in the following regression:

$s_{t+1} - g_t = \theta_0 + \theta_1(\tilde r^*_{t+1} - \tilde r_{t+1}) + \theta_2(s_t - g_{t-1}) + \eta_{t+1}$. (14)

The results of this regression are reported in table 5 (with $\theta_1$
constrained to be equal across equations). In addition, in table 5 the
test statistics for various joint hypotheses (using the unconstrained
estimates) are presented along with the p-values associated with the
asymptotic, Monte Carlo, and bootstrap distributions of the test
statistic. The hypothesis that the intercepts are all zero and the
hypothesis that the values of $\theta_1$ are equal across equations are
not rejected (at the 5 percent level) using any of the alternative
distributions. The hypothesis that $\theta_1 = 1$ for all equations is
rejected using the asymptotic distribution but is not rejected using the
Monte Carlo or bootstrap distributions. The hypothesis that
$\theta_2 = 0$ for all equations and the hypothesis that $\theta_0 = 0$,
$\theta_1 = 1$, and $\theta_2 = 0$ for all equations are rejected using
the asymptotic distribution and are on the boundary of the critical region
using the Monte Carlo and bootstrap distributions. Given that the evidence
in Freedman and Peters (1982) indicates that the bootstrap tends to
understate the true standard errors, it is likely that the true p-values
are larger (less significant) than those reported by the bootstrap. A
comparison of the asymptotic, bootstrap, and Monte Carlo distributions of
$\lambda$ is given in table 6.

The results indicate that the highly significant autocorrelation
found in Hansen and Hodrick (1980, 1983) and Korajczyk (1983)
seems to be a result of time-series variation in the ex ante real interest
rate differentials. Some autocorrelation remains, although it is barely
significant (using the estimates of the small-sample distribution of the
test statistics). This remaining autocorrelation could be due to a number
of factors besides random error. First of all, errors in measuring
the real interest rate differential (most likely through measurement
errors in the consumer price indexes) may prevent the first-stage
instrumental variables regressions from picking up all of the information
in the instruments. In addition, the instruments may not incorporate
all of the relevant information useful in predicting the real return
differential (since instruments $X_t$ were used rather than $\phi_t$). In
essence, the error in using the fitted values from the first-stage
regression instead of the true value of
$E(\tilde r^*_{t+1} - \tilde r_{t+1} \mid \phi_t)$ may be autocorrelated,
and this is picked up in the second-stage regressions.

14 The model (13) is also estimated using 3-month and 6-month forward
prices. Standard 3SLS routines will not produce accurate standard errors
because of the moving average error structure induced by the overlapping
data. Rather, the two-step, two-stage least-squares (2S2SLS) estimator of
Cumby, Huizinga, and Obstfeld (1983) is used. The 2S2SLS estimates of the
parameter vector $\delta$ in (13) are quite imprecise. Although the
hypothesis that $\delta_0 = 0$ and $\delta_1 = 1$ for all equations is not
rejected, neither is the hypothesis that $\delta_0 = \delta_1 = 0$. That
is, the tests have very low power. For this reason the results are not
presented here.

TABLE 5

Restricted Instrumental Variables Estimates Including Autoregressive Term
(April 1974-December 1980)

$s_{t+1} - g_t = \theta_0 + \theta_1(\tilde r^*_{t+1} - \tilde r_{t+1}) + \theta_2(s_t - g_{t-1}) + \eta_{t+1}$

A. Estimates

Country   $\theta_0$       $\theta_1$      $\theta_2$      D-W (a)
UK         .006 (.004)     1.57 (.37)       .22 (.1?)      2.00
CA        -.002 (.001)     1.57 (.37)       .10 (.12)      2.02
BE         .000 (.004)     1.57 (.37)       .13 (.07)      1.90
FR         .001 (.004)     1.57 (.37)       .01 (.10)      2.07
GE        -.002 (.004)     1.57 (.37)       .11 (.10)      1.86
IT         .000 (.004)     1.57 (.37)       .01 (.09)      1.96
NE         .000 (.004)     1.57 (.37)       .04 (.09)      1.86
SW         .002 (.005)     1.57 (.37)      -.04 (.08)      1.94

B. Test Statistics

                                                              P-Value      P-Value       P-Value
Test                                         $\lambda$   df   Asymptotic   Monte Carlo   Bootstrap
$\theta_0 = 0$, all equations                 10.83       8    .212         .350          .353
$\theta_1$, equal across equations            13.72       7    .056         .153          .170
$\theta_1 = 1$, all equations                 17.86       8    .022         .083          .130
$\theta_2 = 0$, all equations                 20.27       8    .009         .053          .047
$\theta_0$, $\theta_2 = 0$, and
  $\theta_1 = 1$, all equations               47.72      24    .003         .050          .040

Note: Weighted $R^2$ = .08.
(a) D-W statistic from first-stage regressions.

Another possible source of error is the assumption that deviations 
TABLE 6

Comparison of Bootstrap, Monte Carlo, and Asymptotic Distribution of Test Statistics

A. Ninety-fifth Percentiles for Distribution of $\chi^2$

Test                                                    df   Bootstrap   Monte Carlo   Asymptotic

$s_{t+1} - g_t = \delta_0 + \delta_1(r^*_{t+1} - r_{t+1}) + \epsilon_{t+1}$:
  $\delta_0 = 0$, all $i$                                8     18.5        16.4          15.5
  $\delta_1$ equal, all $i$                              7     18.9        16.9          14.1
  $\delta_1 = 1$, all $i$                                8     21.4        19.0          15.5
  $\delta_0 = 0$, $\delta_1 = 1$, all $i$               16     33.3        32.0          26.3

$s_{t+1} - g_t = \theta_0 + \theta_1(r^*_{t+1} - r_{t+1}) + \theta_2(s_t - g_{t-1}) + \eta_{t+1}$:
  $\theta_0 = 0$, all $i$                                8     19.0        17.6          15.5
  $\theta_1$ equal, all $i$                              7     21.6        17.6          14.1
  $\theta_1 = 1$, all $i$                                8     23.0        19.0          15.5
  $\theta_2 = 0$, all $i$                                8     20.2        21.2          15.5
  $\theta_0 = 0$, $\theta_1 = 1$, all $i$               16     34.5        32.8          26.3
  $\theta_0 = 0$, $\theta_1 = 1$, $\theta_2 = 0$, all $i$  24  46.4        47.7          36.4

B. Kolmogorov-Smirnov Tests for Equality of Distributions

Test                                                    Bootstrap =    Monte Carlo =    Bootstrap =
                                                        Asymptotic     Asymptotic       Monte Carlo

$s_{t+1} - g_t = \delta_0 + \delta_1(r^*_{t+1} - r_{t+1}) + \epsilon_{t+1}$:
  $\delta_0 = 0$, all $i$                                 .134**         .153**           .060
  $\delta_1$ equal, all $i$                               .172**         .180**           .033
  $\delta_1 = 1$, all $i$                                 .192**         .175**           .043
  $\delta_0 = 0$, $\delta_1 = 1$, all $i$                 .226**         .219**           .060

$s_{t+1} - g_t = \theta_0 + \theta_1(r^*_{t+1} - r_{t+1}) + \theta_2(s_t - g_{t-1}) + \eta_{t+1}$:
  $\theta_0 = 0$, all $i$                                 .162**         .181**           .033
  $\theta_1$ equal, all $i$                               .187**         .174**           .040
  $\theta_1 = 1$, all $i$                                 .208**         .180**           .030
  $\theta_2 = 0$, all $i$                                 .507**         .209**           .043
  $\theta_0 = 0$, $\theta_1 = 1$, all $i$                 .241**         .245**           .033
  $\theta_0 = 0$, $\theta_1 = 1$, $\theta_2 = 0$, all $i$ .277**         .288**           .030

* Significant at the 5 percent level.
** Significant at the 1 percent level.


from PPP are well described by a martingale (i.e., $E[d_{t+1} \mid \phi_t] = d_t$).
One might expect that such deviations would tend to reverse themselves.
However, reversals of this type would lead to negative autocorrelation
in the forecast error series, not the observed positive autocorrelation.
Also, Roll (1979) and Adler and Lehmann (1983) find
support for the martingale hypothesis.

In any event, the significant departure from UEH is primarily due
to forecastable differences in real returns on nominal bonds
denominated in different currencies. This result is consistent with a
world in which expectations are set rationally, but where different
levels of purchasing power risk across countries cause differing risk
premia and, therefore, a differential in expected real interest rates.


IV. Summary 

A number of empirical studies have found that past information can 
be used to predict the deviation of future spot exchange rates from 
current forward exchange rates. This evidence is inconsistent with the 
unbiased expectations hypothesis. When agents are risk averse, UEH 
is not an implication of market efficiency (rational expectations) when 
purchasing power risks differ across countries. The observed bias 
may be due to predictability of risk premia rather than abnormal 
returns. If risk premia are the cause of the deviations from unbiased 
expectations, then past information should be useful in predicting the 
forward forecast errors only to the extent that it is useful in predicting 
the real return differential across nominal bonds denominated in 
different currencies. 

The results of Section III show that there is a significant relation
between the forecast error and the projection of the real return
differential on past information. Also, the past information does not
have any explanatory power beyond its influence through the real
return differential.

Additionally, the simulation results presented above indicate that
the small-sample distributions of standard test statistics can be far
from the asymptotic distribution, partly because of nonnormal error
terms and partly because of small sample sizes. Adjustments for the
small-sample bias against the null hypothesis often lead to different
inferences from those that would be obtained using the asymptotic
distribution. Thus, care should be taken when using asymptotic results.
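
The gap between small-sample and asymptotic critical values can be illustrated with a short simulation. The sketch below is not the bootstrap or Monte Carlo design used in the paper. It assumes eight independent series of 20 normally distributed observations (both numbers are illustrative) and sums the squared t-statistics for a zero mean, a statistic whose asymptotic distribution is chi-square with 8 degrees of freedom.

```python
import random


def simulated_percentile(n_series=8, n_obs=20, reps=4000, q=0.95, seed=0):
    """Monte Carlo q-th percentile of a sum of squared t-statistics.

    Each replication draws n_series independent samples of n_obs standard
    normal errors and sums the squared t-statistics for the null of a zero
    mean. Asymptotically the sum is chi-square with n_series degrees of
    freedom. All sizes here are illustrative assumptions.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        total = 0.0
        for _ in range(n_series):
            sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
            mean = sum(sample) / n_obs
            var = sum((x - mean) ** 2 for x in sample) / (n_obs - 1)
            t = mean / (var / n_obs) ** 0.5
            total += t * t
        stats.append(total)
    stats.sort()
    return stats[int(q * reps) - 1]


# Chi-square(8) 95th percentile; table 6 reports it rounded to 15.5.
ASYMPTOTIC_95_DF8 = 15.51

small_sample_95 = simulated_percentile()
print(f"simulated: {small_sample_95:.1f}, asymptotic: {ASYMPTOTIC_95_DF8}")
```

Even with normal errors, the simulated percentile lies above the asymptotic value, so a test that uses the asymptotic critical value rejects a true null too often in samples of this size.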

As stated earlier, no specific asset-pricing model is tested here. 
Rather, the restrictions tested in Section III must hold if any of the
standard models are to explain the observed deviations from UEH. A 
logical extension is to incorporate the real return differential and 
forward forecast errors into a specific asset-pricing model (possibly 
with time-varying parameters). These extensions are left for future 
endeavors. 
The purposes of this paper are twofold. The first is to demonstrate
that the expected utility hypothesis is a reasonable description of
behavior for consumers who face a low-probability, high-loss natural
hazard event, given that they have adequate information. The second
is to demonstrate that in California information on earthquake
hazards was generated by a 1974 state law that created a market for
safe housing that previously did not exist.


I. Introduction 

In a recent survey article on expected utility theory, Schoemaker 
(1982) describes the theory as “the major paradigm in decision mak- 

The research reported here was funded by the U.S. Geological Survey. We would 
like to give special thanks to Richard Bernknopf, Edward Dyl, James Murdoch, Robert 
Wallace, Carl Wentworth, and an anonymous referee. 

[Journal of Political Economy, 1985, vol. 93, no. 2]

© 1985 by The University of Chicago. All rights reserved. 0022-3808/85/9302-0003$01.50



ing since the Second World War.” But Schoemaker indicates that in 
field studies the theory has not been supported. In particular, people 
do not behave as if they are maximizing expected utility for low- 
probability, high-loss events such as natural disasters. This conclusion 
is drawn from the work by Robertson (1974), Kunreuther (1976), and 
others. For example, Kunreuther interviewed homeowners in flood 
plains and earthquake-prone areas and concluded that the expected 
utility model “provides relatively little insight into the individual 
choice process regarding the purchase of [flood and earthquake] in¬ 
surance.” 

The results in this paper are more encouraging for expected utility 
theory. An expected utility model of self-insurance that incorporates 
a hedonic price function is developed and applied to low-probability, 
high-loss earthquake hazards. Individuals can self-insure by purchas¬ 
ing houses in areas where the expected earthquake damage is rela¬ 
tively low. Our empirical results establish the existence of a hedonic 
price gradient for safety in the Los Angeles and San Francisco areas; 
ceteris paribus, individuals pay less for houses located in relatively 
hazardous areas. Moreover, the magnitude of the price gradient is 
consistent with our theoretical results when reasonable estimates of 
earthquake probabilities and potential damages are used, thereby 
lending support to the expected utility paradigm. 1 

The existence of a safety price gradient implies that individuals in 
the Los Angeles and San Francisco areas possess information on the 
relative danger of different locations. Yet Kunreuther found that
Californians residing in earthquake-prone areas did not purchase 
earthquake insurance, in spite of subjective values on probabilities 
and magnitudes of potential losses that suggest such insurance may 
have been desirable. Our empirics show that a 1974 law passed by the 
state of California provided information that has allowed individuals 
to self-insure. Essentially, the law’s passage created a market for safety 
that affected housing values. 

The paper is organized as follows: In Section II, a simple theoret¬ 
ical model of self-insurance that includes a hedonic price function is 
developed. Empirical results on the existence of a safety price gra¬ 
dient and the source of safety information are presented in Section 
III. Section IV demonstrates the applicability of the expected utility 
model. A review of alternative evidence and qualifications to our 
analysis follows in Section V. 


1 Our approach can be likened to that of Gould (1969), who shows that the expected 
utility hypothesis cannot be rejected as a description of behavior for consumers pur¬ 
chasing auto insurance. 
II. Theory

The theoretical model combines previous work on self-insurance in 
an expected utility framework with hedonic housing value analysis. 
Ehrlich and Becker (1972) discuss the acquisition of market insurance 
as a method of redistributing resources toward the less well-endowed 
states. They indicate that in lieu of market insurance, individuals may 
choose to perform a similar redistribution through self-insurance. 
The latter is therefore seen as a substitute for market-obtained insur¬ 
ance. Familiar examples of self-insurance include procuring a burglar 
alarm to thwart thieves or wearing a helmet while riding a bicycle. For 
earthquake hazards, self-insuring would entail, inter alia, locating 
one’s residence in an area of relative safety. 2 If enough consumers 
possess information on where the relatively safer areas are located, 
one would expect to see higher housing values in these areas ceteris 
paribus. Location, with regard to safety, is a housing attribute much 
the same as other attributes including structural, neighborhood, and 
community characteristics. Thus, consumers choose a level of self- 
insurance through their locational choices with respect to earthquake 
safety. 

In order to incorporate housing attributes into the self-insurance 
model, a hedonic price function similar to the type introduced by 
Rosen (1974) is utilized. Housing value studies using hedonic prices 
have proved fruitful for valuing public goods such as clean air (An¬ 
derson and Crocker 1971; Harrison and Rubinfeld 1978), social in¬ 
frastructure (Cummings, Schulze, and Mehr 1978), and noise level 
(Nelson 1979), as well as estimating prices for more traditional attri¬ 
butes such as square footage, fireplaces, and swimming pools. The 
safety attribute is novel, however, in that it is random; it enters 
the consumer’s utility function differently depending on the state of 
the world that prevails. It has a mitigating effect on damage if an 
earthquake occurs, whereas if there is no earthquake, there is no 
damage. 

The existence of a hedonic price gradient for the safety attribute 
reveals that information about natural hazards is available and that 


2 Ehrlich and Becker (1972) distinguish between self-insurance and self-protection.
The former reduces the loss in the event (e.g., earthquake) state whereas the latter
reduces the probability that the loss will occur. It might be argued that location away
from an earthquake hazard area accomplishes either or both of these objectives. However,
reducing the loss in the case of an event rather than reducing the probability of
the event and the associated loss seems more plausible. Therefore, we view the location
decision as equivalent to the purchase of self-insurance. Although market insurance is
available in some areas, few consumers purchase it. Only 4 percent of the structures in
Los Angeles are covered by earthquake insurance (Science, May 1976).
consumers account for this information in their decision making. In
our theoretical development, consumers are assumed to be informed
about relatively safe and unsafe locations. The information may be
attained by visual inspection, word of mouth, or a government program
that delineates relatively unsafe housing locations for home
buyers. The empirical results in Section III not only support the
contention that information is available and considered in home purchase
decisions, they also shed light on the source of the information.

The consumer’s problem is to maximize expected utility over two
states of the world: the earthquake state and the no-earthquake state,
which occur with probabilities $p$ and $1 - p$, respectively. The consumer
pays $P(a, s)$ for a house, where $a = (a_1, \ldots, a_n)$ is a vector of $n$
attributes and $s$ is the safety attribute. Specifically, $s$ is the monetary
loss that the consumer perceives would be sustained during an earthquake.
The function $P(a, s)$ is assumed to be twice continuously differentiable
in all arguments with first partial derivatives positive for $i = 1, \ldots, n$.
This implies that the $n$ attributes are all desirable; if, for
instance, neighborhood crime is considered, the attribute is the absence
of crime. The partial derivative of the hedonic price equation
with respect to the safety attribute is necessarily negative as shown
below.

Expected utility is written as

$$V = pU[W(a) - P(a, s) - s] + (1 - p)U[W(a) - P(a, s)], \tag{1}$$

where $U$ has continuous first and second partial derivatives. The
function $W(a)$ is the wealth equivalent of the bundle of attributes the
consumer has in the two states and is also assumed to be twice continuously
differentiable. The safety attribute (or the amount of self-insurance)
appears in both states as a reduction in the price of the
house but appears again in the earthquake state as a damage loss.

The optimum choice of attributes is characterized by the following
first-order conditions:

$$a_i:\quad pU'_e(W_i - P_i) + (1 - p)U'(W_i - P_i) = 0, \tag{2}$$

$$s:\quad \frac{-(1 - p)P_s}{p(1 + P_s)} = \frac{U'_e}{U'}, \tag{3}$$

where subscripts on $W$ and $P$ denote partial derivatives and the $e$
subscript on $U$ denotes evaluation in the earthquake state. Assuming
nonsatiation ($U'_e, U' > 0$), condition (2) implies that the $i$th attribute is
chosen where $W_i = P_i$, or its marginal value to the consumer equals its
marginal cost in the market. Condition (3) indicates that at the optimum
the ratio of marginal utilities in the two states must equal the
price ratio of self-insurance where the prices are weighted by the state
of the world probabilities. 3 Note also that $-1 < P_s < 0$, or an additional
dollar spent on safety must decrease damages by more than a
dollar.
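
Condition (3) can be read as a mapping from the earthquake probability and the slope of the hedonic price function to the implied ratio of marginal utilities. The sketch below evaluates that mapping. The function name and the numerical inputs are illustrative assumptions, not values from the paper. Under risk neutrality the ratio equals one, which happens exactly when the slope with respect to $s$ equals $-p$.

```python
def marginal_utility_ratio(p, price_slope):
    """Ratio U'_e / U' implied by condition (3).

    p is the earthquake probability and price_slope is P_s, the partial
    derivative of the hedonic price with respect to the loss s.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not -1.0 < price_slope < 0.0:
        raise ValueError("P_s must lie strictly between -1 and 0")
    return -(1.0 - p) * price_slope / (p * (1.0 + price_slope))


# Illustrative numbers only (not estimates from the paper).
p = 0.02
neutral_ratio = marginal_utility_ratio(p, price_slope=-p)      # P_s = -p
averse_ratio = marginal_utility_ratio(p, price_slope=-2 * p)   # steeper gradient
print(neutral_ratio, averse_ratio)
```

A price gradient steeper than $-p$ therefore implies a ratio above one, which is the risk-averse case.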

Assuming second-order conditions are satisfied, optimum values of 
a and s solve conditions (2) and (3). Either risk neutrality or risk 
aversion is compatible with second-order sufficient conditions. 4 

Equation (3) forms the basis for testing the expected utility model.
That is, given values for the unknown parameters in equation (3) one
can determine whether or not individuals act in accordance with expected
utility theory. The empirical analysis presented in the next
section is directed at determining both the existence of a price gradient
with respect to relative earthquake safety and the magnitude of
any price differential ($P_s$). In addition, the source of this location
information is examined. In Section IV the estimated price differential
is combined with probability and expected damage estimates to
analyze the expected utility model.

III. Empirical Analysis: Hedonic Housing 
Equations 

In the theoretical model it was hypothesized that individuals, acting 
on hazard information and possessing varying levels of risk aversion, 
would locate along a hedonic price gradient, with relatively safer 
homes commanding higher prices, everything else equal. In this sec¬ 
tion, a methodology that enables this hypothesis to be tested is de¬ 
scribed. Empirical tests are conducted for both Los Angeles County 
and the San Francisco Bay Area counties—Alameda, Contra Costa, 
and San Mateo. Also included is a description of the data base and the 
test results. 

Proximity to earthquake-related hazards is the important variable 
under study. Relatively hazardous areas have been delineated 
through research programs conducted by the U.S. Geological Survey 
and the California Division of Mines and Geology. The outcome of
these efforts was the Alquist-Priolo Special Studies Zone Act passed 
by the California legislature in 1972 and amended in 1974, 1975, and 
1976. This act represents an attempt to provide society with informa¬ 
tion concerning relative earthquake-associated risk. 

Special Studies Zones (SSZs) are designated areas of elevated rela¬ 
tive risk determined by potentially and recently active earthquake 
fault traces (surface displacement has occurred in Holocene time, i.e., 

3 See Ehrlich and Becker (1972) for graphical interpretations of a similar result.

4 One of the sufficient conditions for a maximum is that
$V_{ss} = p[U''_e(1 + P_s)^2 - U'_e P_{ss}] + (1 - p)[U''P_s^2 - U'P_{ss}] < 0$.
This is satisfied if the marginal cost of safety is increasing ($P_{ss} > 0$)
and if either $U'' = 0$ or $U'' < 0$ for risk neutrality or risk aversion, respectively.
over the last 11,000 years). The evidence of faults may be directly
observable (ruptured streets, crooked fences, etc.) or inferred (i.e., 
geomorphic shapes). The length of an SSZ coincides with the fault 
length whereas the width is generally one-eighth of a mile on each 
side of the fault. 

Within California, the total number of SSZs designated through 
January 1979 was 251. There are two important ways in which con¬ 
sumers become aware of these. First, when an SSZ is designated, 
property owners in the zone are notified. Second, consumers selling 
property in an SSZ are required to notify prospective buyers that 
the property is in a zone (Alquist-Priolo Special Studies Zones Act 
1974). This latter requirement has been implemented by the Depart¬ 
ment of Real Estate by having agents disclose the information via an 
addendum to the purchase contract. The buyer is then granted a 
period to collect additional information or to cancel the sale. 

The potential effects of the Alquist-Priolo Act form the basis of a 
testable hypothesis. The null hypothesis is that consumers respond to 
the awareness of hazards associated with SSZs with the alternative 
being that they do not. 


Data Specifics 

The study areas are Los Angeles County and the San Francisco Bay 
Area counties, and observations are confined to single family resi¬ 
dences. Thus, we do not consider the impact of hazard location on 
other structures (multiple family dwellings, mobile homes, commer¬ 
cial, etc.) or other ownership types (rental, leasing, etc.). Therefore, 
within our sample, this research asks if Los Angeles and San Fran¬ 
cisco Bay Area households will pay a premium in the form of higher 
housing values for homes located outside an SSZ and what is the 
magnitude of that willingness to pay. 

The data base was constructed so that hypotheses concerning the 
impact of SSZ location differences on housing sale price could be 
tested. The dependent variable in the entire analysis is the sale price
of owner-occupied single family residences. 5 The independent vari¬ 
able set consists of variables that correspond to three levels of aggre¬ 
gation: house, neighborhood, and community. The Appendix de¬ 
scribes further the data employed in the study. 

The housing characteristic data, obtained from the Market Data 
Center (a computerized appraisal service centered in Los Angeles), 


5 Note that sale price or the discounted present value of the flow of rents rather than
actual rent is used as the dependent variable. The two are interchangeable given the
appropriate discount rate.
pertain to houses sold in 1978 and contain information on nearly
every important structural and/or quality attribute. The Appendix 
provides summary statistics for the housing, neighborhood, and com¬ 
munity characteristics used in the hedonic analysis. It should be em¬ 
phasized that housing data of such quality (e.g., micro level of detail) 
are rarely available for studies of this nature. Usually outdated data 
that are overly aggregate (for instance, census tract averages) are 
employed. These data yield functions relevant for the “census tract” 
household but are only marginally relevant at the household (micro) 
level. 

The Market Data Center provided computer data tapes listing all 
houses sold in Los Angeles County and the San Francisco Bay Area 
counties during the period specified. The number of entries was un¬ 
manageably large, so the data set was reduced as follows. First, a data 
set was constructed that contained houses within SSZs. 6 This was accomplished
by first searching the tape for all houses located in census
tracts that were wholly or partly in an SSZ. This list was further 
reduced through a random number matching system. The addresses 
of the remaining entries were then checked against a detailed map to 
select those clearly within an SSZ. The numbers of valid Los Angeles 
County and the San Francisco Bay Area SSZ data points were 292 and 
745, respectively. 

Second, data sets were constructed that included houses not located 
in hazard areas. After deletion of incomplete data entries, a random 
number matching system was utilized to choose sample sizes of ap¬ 
proximately five thousand observations in each study area. The safety 
variable is then represented by a dummy variable that takes on the 
value one for houses in an SSZ and zero otherwise. 

In addition to the immediate characteristics of a home, other vari¬ 
ables that could significantly affect its sale price are those that reflect 
the condition of the neighborhood and community in which it is 
located. That is, school quality, ethnic composition, proximity to em¬ 
ployment centers (and in Los Angeles County, distance to the beach), 
and measures of the ambient air quality have a substantial effect on 
sale price. In order to capture these impacts and to isolate the inde¬ 
pendent influence of location vis-a-vis the SSZs, these variables were 
included in the econometric modeling. 

The data base assembled for the housing value study is appropriate 
to test the hypotheses outlined above for two reasons. First, the hous¬ 
ing characteristic data are extremely detailed at the household level of 
aggregation and extensive in that a relatively large number of obser¬ 
vations are considered. Second, a variety of neighborhood and com- 


6 See Hart (1977) for the location of SSZs.



TABLE 1

Estimated Hedonic Equations for Los Angeles County and Bay Area Counties

Variables                                  Los Angeles County      Bay Area Counties

Site-specific characteristics:
  Sale date                                  .002    (17.92)         .008    (8.17)
  Age of home                               -.002   (-11.37)         .0005   (2.37)
  Square feet of living area                 .0003   (36.85)         .00005 (14.85)
  Number of bathrooms                        .098    (11.58)         .260   (40.12)
  Number of fireplaces                       .124    (17.90)         .188   (27.86)
  Pool                                       .093     (8.66)         .067    (4.83)
  View                                       .143    (11.56)         .128   (12.68)
  SSZ location                              -.056    (-3.76)        -.033   (-3.39)
Community characteristics:
  School quality                             .020    (20.72)         .012   (12.85)
  Home density                              -.00004  (-7.72)        -.00002 (-14.15)
  Percent black                             -.006   (-33.55)        -.006  (-29.91)
  Percent greater than 62 years old          .003     (6.35)         .009   (18.22)
  Air pollution                             -.001    (-5.01)        -.004   (-9.76)
Location characteristics:
  Distance to employment                    -2313    (-2.04)        -401    (-[?]17)
  Distance to beach                         -.016   (-22.44)         N.A.
  Alameda County                             N.A.*                  -.158  (-15.78)
  Contra Costa County                        N.A.                   -.27   (-21.20)
Constant                                    5.003    (60.59)        5.335   (77.17)
R^2                                          .79                     .69
Residual sum square                        281.02                  302.570
Number of observations                     4,865                   5,438

Note. Dependent variable = ln(home sale price in 1978 in [...]); t-statistics in parentheses.
* N.A. = not applicable.
[...] influence on housing values have been included. 7


Empirical Results 


The underlying structure of the hypothesis test is a single-equation
empirical model that attempts to explain the sale price of homes
located in Los Angeles County and the San Francisco counties.
The estimated coefficients of these hedonic equations specify the
effect a change in a particular independent variable has on sale price.
In reference to the SSZ location variable, this procedure allows one to
focus on its significance while separating out the influence of other
extraneous variables. Therefore, this analysis yields two outputs concerning
the relationship of hazard location differentials to housing
price. First, the relative significance of location variations is determined
and, second, the estimated coefficient pertaining to location
implicitly measures its monetary value.

The estimated Los Angeles and San Francisco hedonic gradients
that provide the best fit of the data are presented in table 1. 8 A
number of aspects of the equations are worth noting. First, as measured
by $R^2$, the nonlinear form is a significant improvement over
linear specifications. In addition, a comparison of the log of the likelihood
values (semilog to the linear) indicated that the semilog form
was a significant improvement at the 1 percent level (see Judge et al.
1980). As Rosen (1974) pointed out, this is to be expected since consumers
cannot always arbitrage by dividing and repackaging bundles
of housing attributes. Thus, on both theoretical and empirical
grounds the semilog specification proved to be a better functional
form.
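
Comparing a linear and a semilog specification by likelihood requires putting both on the same scale, because one models the price and the other its logarithm. The sketch below shows one standard way to do this: the semilog log-likelihood is adjusted by the Jacobian term, the sum of ln(y). This is an illustration on synthetic data with a single regressor, not the authors' procedure or data, and all function names are assumptions.

```python
import math
import random


def ols_rss(x, y):
    """Residual sum of squares from a one-regressor OLS fit with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def gaussian_loglik(rss, n):
    """Maximized Gaussian log-likelihood of a regression with the given RSS."""
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)


def compare_linear_semilog(x, y):
    """Return (linear, semilog) log-likelihoods, both for y in levels."""
    n = len(y)
    log_y = [math.log(v) for v in y]
    ll_linear = gaussian_loglik(ols_rss(x, y), n)
    # The Jacobian of y -> ln(y) is 1/y, so subtract sum(ln y) to compare in levels.
    ll_semilog = gaussian_loglik(ols_rss(x, log_y), n) - sum(log_y)
    return ll_linear, ll_semilog


# Synthetic data generated from a semilog relation (hypothetical values).
rng = random.Random(1)
x = [rng.uniform(0.0, 4.0) for _ in range(200)]
y = [math.exp(1.0 + 0.5 * xi + rng.gauss(0.0, 0.1)) for xi in x]
ll_linear, ll_semilog = compare_linear_semilog(x, y)
print(f"linear: {ll_linear:.1f}, semilog: {ll_semilog:.1f}")
```

When the data are generated by a semilog relation, the adjusted semilog log-likelihood exceeds the linear one, which is the direction of the comparison reported in the text.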

Second, in the semilog equations all coefficients have the expected
sign and are significantly different from zero at the 1 percent level.
The SSZ dichotomous location variable has the a priori expected relationship
to home sale price and is significant at the 1 percent level.
This result is invariant with respect to various sample sizes, model
formulations (various independent variable sets were tested), and estimated
functional form. 9 These results indicate that individuals are


7 See Freeman (1979) and Maler (1977) for a review of estimates of hedonic housing
equations.

8 The main difference between the Los Angeles and Bay Area analyses is the locational
variables. In the Bay Area distance to beach (ocean) is unimportant due to the
presence of the bay. In addition, the three Bay Area counties were assigned dichotomous
variables to account for county differences. San Mateo County is the excluded
group and therefore is included in the constant term.

9 Since the SSZ location variable is a zero-one variable, our choice set over
functional forms was essentially restricted to the linear and semilog forms. Thus, possi-
acting on hazard information when making locational choices, and
this action is translated into a measurable hedonic gradient.

Regarding the monetary impact on housing sale price, the non¬ 
linear specification does not allow straightforward interpretation 
since the effect of any independent variable depends on the level of 
all other variables. However, the Los Angeles County (Bay Area) 
results indicate that if all other variables are assigned their mean 
values, then living outside of an SSZ causes an increase in home value 
of approximately $4,650 ($2,490) over an identical home located in 
an SSZ. In relative terms the magnitude has approximately one-half 
the impact of a swimming pool or one-third the value of a view. 
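
The dollar figures follow from how a dummy coefficient is read in a semilog equation. The sketch below computes the gap two ways: the first-order approximation (coefficient times price) and the exact gap implied by the log form. The coefficient is the Los Angeles SSZ estimate from table 1; the price level of $83,100 is an assumed round number for illustration only, and the function name is hypothetical.

```python
import math


def ssz_price_effect(beta, price):
    """Dollar gap between otherwise identical homes outside and inside an SSZ.

    beta is the semilog coefficient on the SSZ dummy (negative) and price is
    the value of the home outside the zone. Returns (approximate, exact).
    """
    approx = -beta * price                  # first-order approximation
    exact = price * (1.0 - math.exp(beta))  # exact gap implied by ln(price)
    return approx, exact


# Coefficient from table 1 (Los Angeles County); the price level is an
# assumed round number used only for illustration.
approx, exact = ssz_price_effect(beta=-0.056, price=83_100)
print(round(approx), round(exact))
```

Under these assumptions the first-order figure is close to the reported Los Angeles value, while the exact log-form gap is slightly smaller.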

In the next section, these monetary figures are used to test the
expected utility model. But before proceeding to this analysis, we can
confirm the source of the hazard information used by home buyers.
As indicated above, the Alquist-Priolo Act was enacted in 1974.
Therefore, a pre-1974 analysis of the housing market would yield
insight concerning the importance of the act in providing consumers
relative risk information.

Housing data for the 1972 time period are used in the test of the
Alquist-Priolo Act. Successful enhancement of consumers’ awareness
by the Alquist-Priolo disclosure provisions would require a change in
the hedonic rent gradient over time. This change could take one of
two forms: (i) an SSZ location would be an insignificant housing characteristic
in 1972 yet significant in 1978; or (ii) the location variable
would be significant in both years but its relative magnitude would
increase over time. The first type of change could be considered a
strong test of the impact of the Alquist-Priolo Act since the act would
have filled an existing information void. Thus evidence of a direct
market effect would be available. The magnitude change of the SSZ
variable would imply a weaker response since it would be evident that
consumers had hazard location information from some other source
and were already acting on it before passage of the Alquist-Priolo Act.

The relative impact of hazard information independent of the
Alquist-Priolo Act is also tested using the pre- and postdata sets; that is,
if SSZ location remains a stable (no relative magnitude change),
significant determinant of housing price, then consumers are acting
on some available information although their preferences have not
been enhanced or changed by the public disclosure program.


ble forms such as quadratic, log, inverse semilog, exponential, semilog exponential, and
the Box-Cox transformation of the SSZ location variable are not available since they
inevitably reduce to zero-one or cannot be estimated (e.g., log of zero). Further, a
Box-Cox transformation of the dependent variable that is not equivalent to linear or semilog
yields results that are difficult to interpret. Finally, the translog transformation is not available
because the objective is to determine the separate influence of SSZ locations.
The 1972 time period results are presented in table 2. The semilog
functional form provides the best fit of the data, and all coefficients, 
with the exception of SSZ location, are significant at the 1 percent 
level and related to home sale price as expected. However, the most 
noteworthy aspect of the equations is that the SSZ location variable 
does not demonstrate significance in 1972, even at the 10 percent 
level. The combined 1972 and 1978 results indicate that the Alquist- 
Priolo Act has caused a structural change in the hedonic gradient over 
time. This is evidenced both by the significant monetary impact 
change over time and by the change in significance. Therefore, in the 
study areas the Alquist-Priolo Act does pass a strong test of effec¬ 
tiveness, suggesting that the act provided information that consumers 
used in their market decisions. 
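
The change in significance between the two years can be checked directly from the reported t-statistics. The sketch below applies two-sided tests using the normal approximation, which is an assumption that seems reasonable with roughly 5,000 observations per equation. The t-statistics are those on the SSZ location variable in tables 1 and 2, as best read from the source; the function name is hypothetical.

```python
from statistics import NormalDist


def is_significant(t_stat, level):
    """Two-sided test using the normal approximation (samples near 5,000)."""
    critical = NormalDist().inv_cdf(1.0 - level / 2.0)
    return abs(t_stat) > critical


# t-statistics on the SSZ location variable, taken from tables 1 and 2.
ssz_t = {
    ("Los Angeles", 1978): -3.76,
    ("Bay Area", 1978): -3.39,
    ("Los Angeles", 1972): 0.0174,
    ("Bay Area", 1972): -1.44,
}

# For each area and year: (significant at 1 percent, significant at 10 percent).
results = {key: (is_significant(t, 0.01), is_significant(t, 0.10))
           for key, t in ssz_t.items()}
for key, flags in results.items():
    print(key, flags)
```

Both 1978 coefficients pass at the 1 percent level and neither 1972 coefficient passes at the 10 percent level, which matches form (i) of the strong test described above.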


IV. Empirical Results: Expected Utility Model 

If consumers behave as if they maximize expected utility, then first-order
condition (3) must necessarily be satisfied. The terms in condition
(3) include the probability of an earthquake, marginal utilities of
income, marginal damage to a house, and the marginal change in the
house price. Our approach is to solve equation (3) for this latter term
by substituting in reasonable values of all the former terms for the Los
Angeles region. This provides an analytical solution for the price
difference between houses in and out of SSZs. This price difference is
then compared to the observed difference in housing prices estimated
in the previous section. The two differences are shown to be close,
thereby supporting the expected utility paradigm.

In the empirical work, houses were described as either in or out
of an SSZ; that is, the safety attribute was discrete. In equation (3)
[...] an earthquake. Equation (3) can then be rewritten as

$$\Delta P = \frac{-p\,\Delta s\,(U'_e/U')}{1 - p[1 - (U'_e/U')]} < 0. \tag{4}$$
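
The discrete form of the first-order condition can be evaluated directly once values are chosen for the earthquake probability, the damage difference, and the marginal utility ratio. The sketch below implements the expression obtained by replacing the slope in condition (3) with a ratio of differences and solving for the price difference. The function name and all inputs are hypothetical and chosen only to show how the formula behaves.

```python
def price_differential(p, delta_s, mu_ratio):
    """Price gap for a home that suffers delta_s more damage in an earthquake.

    p is the earthquake probability, delta_s the extra loss in the
    earthquake state, and mu_ratio = U'_e / U' (1 under risk neutrality).
    """
    return -p * delta_s * mu_ratio / (1.0 - p * (1.0 - mu_ratio))


# Hypothetical inputs, chosen only to show how the formula behaves.
p, delta_s = 0.01, 10_000.0
neutral = price_differential(p, delta_s, mu_ratio=1.0)
averse = price_differential(p, delta_s, mu_ratio=2.0)
print(neutral, averse)
```

Under risk neutrality the price gap equals the expected extra loss, and a ratio above one widens the gap, which is why the ratio's assumed range bounds the predicted price difference.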


The hedonic housing equation [...] for an average house worth
$83,1[...]. [...] prevailing home mortgage [...] 1978 ([...] percent), this
implies a home outside of an SSZ would cost $442 more per year in
mortgage payments than one in an SSZ. One possible assumption is
that this is the perceived [...] of an SSZ to home buyers, which may
[...] a home turnover rate of once every 3-4 years in [...].
However, if home buyers properly



TABLE 2

Estimated Hedonic Equations for Los Angeles County and Bay Area Counties

Variables                                  Los Angeles County      Bay Area Counties

Site-specific characteristics:
  Sale date                                  .004     (5.20)         .004    (6.96)
  Age of home                               -.005   (-19.52)        -.002  (-15.17)
  Square feet of living area                 .0005   (41.71)         .0002  (47.42)
  Number of bathrooms                       1.15     (19.51)         .084   (15.35)
  Number of fireplaces                       .091    (18.10)         .105   (20.52)
  Pool                                       .151    (14.75)         .105    (9.57)
  View                                       .150    (10.56)         .080   (10.20)
  SSZ location                               .0002   (.0174)        -.022   (-1.44)
Community characteristics:
  School quality                             .0098   (12.44)         .003    (7.34)
  Home density                              -.000017 (-5.88)        -.00001 (-8.83)
  Percent black                             -.0029  (-22.64)        -.002  (-15.147)
  Percent greater than 62 years old          .002     (4.85)         .004   (13.25)
  Air pollution                             -.0018   (-6.35)        -.004  (-13.18)
Location characteristics:
  Distance to employment                    -7.64    (-8.40)        -8.113  (-4.74)
  Distance to beach                         -.0095  (-16.74)         N.A.*
  Alameda County                             N.A.                   1.020 (-135.04)
  Contra Costa County                        N.A.                   -.233  (-25.34)
Constant                                    5.54     (82.05)        6.126  (170.53)
R^2                                          .80                     .91
Residual sum square                        169.44                  150.700
Number of observations                     4,927                   5,460

Note. Dependent variable = ln(home sale price in [...]); t-statistics in parentheses.
* N.A. = not applicable.
perceive the role of inflation and keep their homes for a longer
period, then use of the real rate of interest would be more appropriate
in calculating the true cost differential for living outside of an SSZ.
From the early 1950s up until 1978 the real rate of interest on home
mortgages averaged around 3 percent. If we use this rate of interest,
we obtain a real cost differential of $140 per year. These figures
provide a range for comparison to Δp from equation (4) after
substituting in values for p, Δs, and U'_e/U'.
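
The annualization step can be checked with a short calculation. The sketch below is illustrative only: it assumes the $4,650 Los Angeles sale price differential reported from the property value study, and it treats the annual cost as a simple interest-only payment (differential times interest rate). The function name is hypothetical and not part of the original analysis.

```python
def annual_cost_differential(price_differential, interest_rate):
    """Annual carrying cost of a price differential.

    Assumption: the cost is approximated as an interest-only payment,
    i.e. price differential times the annual interest rate.
    """
    return price_differential * interest_rate


LA_PRICE_DIFFERENTIAL = 4650.0  # dollars, Los Angeles property value study
REAL_MORTGAGE_RATE = 0.03       # long-run real mortgage rate cited in the text

real_cost = annual_cost_differential(LA_PRICE_DIFFERENTIAL, REAL_MORTGAGE_RATE)
print(round(real_cost, 2))  # 139.5, which the text rounds to $140
```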

First, consider a range of values for U'_e/U'. As a lower bound, and to
be consistent with second-order maximization conditions, we use risk
neutrality where U'_e/U' = 1. For risk aversion, however,
1 < U'_e/U' < ∞. To establish an upper bound we appeal to recent work
that employs cross-sectional data on household assets to establish
properties of household utility functions. In particular, Cohn et al.
(1975) found evidence that the coefficient of relative risk aversion is
slightly decreasing in wealth. Friend and Blume (1975) found that "if
there is any tendency for increasing or decreasing proportional risk
aversion, the tendency is so slight that for many purposes the
assumption of constant proportional risk aversion is not a bad first
approximation" (p. 915). More recently, Morin and Suarez (1983) found
the coefficient to be slightly decreasing for wealth levels up to
$100,000, after which it becomes approximately constant. Furthermore,
Friend and Blume estimated the market price of risk to determine a
value for the coefficient, which they argue is greater than one and may
be as high as two. Since we are interested in the ratio of marginal
utilities and not the coefficient of relative risk aversion, we cannot
use these results directly; but we can explore the implications
suggested by this work.

To determine an upper bound, one approach is to examine U'_e/U'
for various utility functions that exhibit the properties cited above.
The largest upper bound is associated with a utility function
exhibiting constant relative risk aversion equal to two; thus, we use
U(A) = -A^{-1}, where A is total wealth. The denominator of U'_e/U' is
evaluated at total wealth, while the numerator is evaluated at total
wealth minus the dollar value of earthquake damage. Again, to determine
the largest upper bound, we assume the maximum expected damage
of about $20,000 developed below. To obtain total wealth we note
from Friend and Blume's data (table 3, p. 908) that over their entire
sample the market value of a house as a percentage of total wealth
averaged 16 percent. 10 Since the average market value of houses in


10 The use of 16 percent as the ratio of market value of houses to total wealth may
seem small until one realizes that Friend and Blume (1975) define wealth to include
human wealth. The authors regard this as the most appropriate definition;
consequently, we use it here.
our sample is $83,153, we use as an estimate of total wealth A =
$83,153/.16 = $519,706. Finally, using U(A) = -A^{-1}, we obtain
U'_e/U' = (519,706/499,706)^2 = 1.08 for the largest upper bound.
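
The upper bound can be reproduced directly from the stated inputs. The sketch below assumes U(A) = -1/A, so that U'(A) = A^{-2}, and evaluates the numerator at total wealth less $20,000 of damage; the variable and function names are illustrative.

```python
def marginal_utility_crra2(wealth):
    """U'(A) for U(A) = -1/A (constant relative risk aversion equal to 2)."""
    return wealth ** -2


house_value = 83153.0          # average house value in the sample
house_share_of_wealth = 0.16   # Friend and Blume (1975) sample average
max_damage = 20000.0           # maximum expected earthquake damage

total_wealth = house_value / house_share_of_wealth
ratio = (marginal_utility_crra2(total_wealth - max_damage)
         / marginal_utility_crra2(total_wealth))

print(round(total_wealth), round(ratio, 2))  # 519706 1.08
```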

Another approach for estimating U'_e/U' is to use a linear
approximation (first-order Taylor series expansion) for describing
changes in U'. Thus, we assume U'(A) ≈ U'(A_0) + U''(A_0)(A - A_0),
where the Taylor series expansion takes place around the level of
wealth A_0. Since the coefficient of relative risk aversion is defined
as c = |U''(A_0)A_0/U'(A_0)|, we can then rewrite our approximation for
U'(A) as (see n. 11)

\[
U'(A) \approx U'(A_0)\left[1 - c\,\frac{A - A_0}{A_0}\right].
\]

If we let A_0 equal the level of wealth before the earthquake and let A
equal the level of wealth after the earthquake, dividing the expression
above by U'(A_0) gives

\[
\frac{U'_e}{U'} \approx 1 - c\,\frac{A - A_0}{A_0}
\]

as an approximation of the ratio of marginal utilities in the two states
of the world. This expression does not depend on use of a particular
utility function, but rather will be a good approximation for utility
functions that have small higher order terms for U''' and beyond.
Using the highest estimated value for c of 2 and the highest estimate
of damages of about $20,000 we obtain

\[
\frac{U'_e}{U'} \approx 1 - 2\left(\frac{499{,}706 - 519{,}706}{519{,}706}\right) = 1.08.
\]


This second approach gives an identical estimate to the first
developed above and suggests that risk aversion plays a surprisingly
small role in our analysis, apparently due to the relatively small
changes in lifetime wealth involved.
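
As a rough check on how close the two approaches are, the sketch below compares a first-order approximation of the marginal utility ratio, 1 - c(A - A_0)/A_0, with the exact ratio under constant relative risk aversion. It is a minimal illustration under those assumptions; the function names are hypothetical.

```python
def ratio_linear(wealth_before, wealth_after, c):
    """First-order (Taylor) approximation of U'(A) / U'(A0)."""
    return 1.0 - c * (wealth_after - wealth_before) / wealth_before


def ratio_crra(wealth_before, wealth_after, c):
    """Exact U'(A) / U'(A0) when U'(A) is proportional to A**(-c)."""
    return (wealth_after / wealth_before) ** (-c)


A0 = 519706.0        # wealth before the earthquake
A = A0 - 20000.0     # wealth after the maximum expected damage
c = 2.0              # highest cited coefficient of relative risk aversion

approx = ratio_linear(A0, A, c)
exact = ratio_crra(A0, A, c)
print(round(approx, 2), round(exact, 2))  # 1.08 1.08
```

The two values differ only in the third decimal place, which is consistent with the remark that wealth changes of this size leave little room for risk aversion to matter.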

To estimate the odds of an event in the Los Angeles area, we use
two sources. First, Kunreuther et al. (1978) report results of a survey
question among California residents on the subjective beliefs
concerning the odds of an earthquake. The average perceived odds of an
event from that survey are about 2 percent per year. 12 To obtain a
more objective estimate of the risk of an event we turn to a report


11 Note that U''A/U' will be a negative number for risk-averse individuals. Thus,
we replace U''A/U' by -c in developing this formula.

12 The average of the perceived odds used here was obtained from fig. 5.7 on p. 96 of
Kunreuther et al. (1978) by taking the average of the end point risk of each risk
category and multiplying by the reported frequency of occurrence.
issued by the Federal Emergency Management Agency (FEMA
1980), which estimated the odds of a large earthquake to be from 2
percent to 5 percent per year for the Los Angeles area. The upper
bound of that range, 5 percent, resulted from scientific concerns over
the Palmdale bulge, a temporary uplifting of the desert floor north of
Los Angeles that occurred in the late 1970s. The lower bound
estimate, which was widely publicized prior to the FEMA report, is based
on the historical pattern of large earthquakes that have occurred in
the Los Angeles area (Sieh 1978). For the relevant time period for our
study, 1972-78, and for the Los Angeles area, there exists a
remarkable coincidence between subjective and objective measures of risk
of an earthquake. The FEMA lower bound estimate, which is appropriate
prior to the occurrence of the Palmdale bulge, and the Kunreuther
et al. estimate both imply p = .02 for estimating Δp in equation (4).

Finally, we need to develop an estimate of earthquake losses or
damages associated with residing in an SSZ as opposed to residing
outside of an SSZ, defined as Δs in equation (4). Again, we can obtain
a subjective estimate of about $20,000 from Kunreuther et al. (1978)
for the average total damage people expect to occur to their homes if
an earthquake occurs. 13 As an alternative measure, engineering
studies suggest that the average damage to a single-story frame house
should a great earthquake occur near Los Angeles would be about 5
percent of the home's value (NOAA 1973). This implies a level of
damage for the average house in our property value sample (worth
$83,153) of $4,158. However, homes in areas of maximum ground
shaking, such as would occur in an SSZ if the local fault ruptured,
would suffer damage equal to about 25 percent of the home's value
(NOAA 1973). For the average house in our sample, this implies
damages of $20,788 (for a home in an area of maximum ground
shaking). These figures obviously span the Kunreuther et al. estimate,
with the upper bound figure quite close, suggesting that households
answering the Kunreuther survey may have perceived the question to
imply that their home would be located in an area of maximum
damage. Note, however, that Δs represents the difference in damages an
individual would expect from living in versus outside of an SSZ
should an earthquake occur. Thus, as an absolute upper bound, we
will use a value of Δs of $20,000 consistent with a subjective
assessment that homes outside of an SSZ will suffer no damage. As a
lower bound we will take the difference in the objective engineering
assessments ($20,788 minus $4,158) of $16,630. Thus, the lower bound
assumes homes in an SSZ will suffer the maximum level of ground
shaking and homes outside an SSZ will suffer average levels of
ground shaking.

13 Again, this average was obtained by weighting expected damage by frequency of
occurrence among the survey respondents from fig. 5.6, p. 94, of Kunreuther et al.
(1978).
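
The engineering-based damage figures follow from the two damage shares applied to the average house value. The sketch below simply reproduces that arithmetic, rounding each damage figure to whole dollars before differencing, as the text appears to do; the variable names are illustrative.

```python
house_value = 83153.0       # average house value in the sample

AVERAGE_DAMAGE_SHARE = 0.05  # average ground shaking (NOAA 1973)
MAXIMUM_DAMAGE_SHARE = 0.25  # maximum ground shaking (NOAA 1973)

average_damage = round(AVERAGE_DAMAGE_SHARE * house_value)   # outside an SSZ
maximum_damage = round(MAXIMUM_DAMAGE_SHARE * house_value)   # inside an SSZ

# Lower bound on the damage differential between living in and out of an SSZ.
lower_bound_differential = maximum_damage - average_damage

print(average_damage, maximum_damage, lower_bound_differential)  # 4158 20788 16630
```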

To obtain an upper bound estimate for the annual value of living
outside of an SSZ to an expected-utility-maximizing household, we
substitute values of U'_e/U' = 1.08, p = .02, and Δs = $20,000 into
equation (4). These figures are consistent with the highest observed
coefficient of relative risk aversion of 2 and the subjective evidence
obtained by Kunreuther et al. on earthquake risk and damages. To
obtain a lower bound estimate we assume risk neutrality so U'_e/U' = 1
and use scientific-engineering evidence for p = .02 and Δs = $16,630.
These assumptions yield a range for Δp of from $333 to $431 per
year. In contrast, from the estimated property value equation, the
perceived annual cost of living outside of an SSZ ranges from $140 to
$440 depending on use of real or nominal interest rates. This
evidence suggests that the estimated property value equation for Los
Angeles is consistent with utility-maximizing behavior with respect to
earthquake risks.
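
The reported range can be reproduced numerically. The sketch below assumes that equation (4) gives the magnitude of the annual differential as p(U'_e/U')Δs / (1 - p[1 - U'_e/U']); that functional form is a reading of the equation, adopted here because it matches the published $333 and $431 figures, and the function name is hypothetical.

```python
def annual_value_outside_ssz(p, mu_ratio, delta_s):
    """Magnitude of the annual price differential implied by expected utility.

    p        -- annual probability of a damaging earthquake
    mu_ratio -- U'_e / U', ratio of marginal utilities in the two states
    delta_s  -- extra damage from living inside rather than outside an SSZ
    Assumed form: p * mu_ratio * delta_s / (1 - p * (1 - mu_ratio)).
    """
    return p * mu_ratio * delta_s / (1.0 - p * (1.0 - mu_ratio))


# Upper bound: risk aversion of 2 and subjective damage estimate.
upper = annual_value_outside_ssz(p=0.02, mu_ratio=1.08, delta_s=20000.0)
# Lower bound: risk neutrality and engineering damage differential.
lower = annual_value_outside_ssz(p=0.02, mu_ratio=1.0, delta_s=16630.0)

print(round(lower), round(upper))  # 333 431
```

Under risk neutrality the expression collapses to pΔs, the expected annual loss, which is why the lower bound is simply 2 percent of $16,630.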

V. Conclusion 

Schoemaker (1982, p. 552) summarizes the problems of expected
utility theory as follows: "As a descriptive model seeking insight into
how decisions are made, EU [expected utility] theory fails on at least
three counts. First, people do not structure problems as holistically
and comprehensively as EU theory suggests. Second they do not
process information, especially probabilities, according to the EU rule.
Finally, EU theory, as an 'as if' model, poorly predicts choice behavior
in laboratory situations. Hence, it is doubtful that the EU theory
should or could serve as a general descriptive model." Our analysis
provides only indirect evidence with respect to Schoemaker's first
point. However, having demonstrated consistency between our
property value market results and the expected utility model for Los
Angeles, we can strengthen the argument considerably by briefly
considering the San Francisco case.

For San Francisco, home sale prices, damage to homes should an
earthquake occur, and, presumably, risk preferences are all similar to
the Los Angeles case analyzed in the previous section. However, the
probability of a damaging earthquake is considerably less according to
available scientific evidence. For example, the FEMA report (1980, p.
3) states: "the current estimated probability ... is smaller [than for
Los Angeles] but significant," and later gives annual odds for a great
earthquake on the San Andreas fault near San Francisco as 1 percent.
These are half the odds given for a great earthquake in the Los
Angeles area in the same report. Thus, from equation (4) of the
previous section one would predict, on the basis of expected utility
theory, that the property value differential for houses in SSZs in the
Bay Area should be about half that observed in Los Angeles. From
the two property value studies the differentials are $2,490 and
$4,650, respectively. This successful "prediction" suggests both that
individual households process probability information in a reasonably
rational and accurate way and that, at least in a market situation with a
well-defined institutional mechanism, the expected utility model may
perform well in predicting behavior. It should be pointed out that
through the decade of the 1970s, the media in California carried an
average of two stories per week relating to local earthquake events,
actual or possible damages, and probabilities (see, e.g., Los Angeles
Times, April 7, 1975; April 4, 1976; April 22, 1978). Possible
earthquake events are a topic of considerable interest within the state,
and the level of awareness among state residents is very high (Turner
et al. 1979). The scientific evidence summarized in the 1980 FEMA study
used in our calculations was widely publicized throughout the 1970s
and may well be responsible for the similarity between the
Kunreuther et al. (1978) subjective probability estimates of earthquake
risk and more objective scientific assessments.
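
The "about half" comparison can be made explicit. The sketch below assumes, as the argument does, that damages and risk preferences are similar in the two areas, so that the predicted ratio of price differentials is approximately the ratio of annual earthquake odds; the variable names are illustrative.

```python
# Annual odds of a great earthquake (FEMA 1980, as cited in the text).
odds_san_francisco = 0.01
odds_los_angeles = 0.02

# Estimated sale price differentials from the two property value studies.
differential_bay_area = 2490.0
differential_los_angeles = 4650.0

# With similar damages and risk preferences, the differential is roughly
# proportional to the odds, so the predicted ratio is the ratio of odds.
predicted_ratio = odds_san_francisco / odds_los_angeles
observed_ratio = differential_bay_area / differential_los_angeles

print(predicted_ratio, round(observed_ratio, 2))  # 0.5 0.54
```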

In summary, the property value studies make a strong case for
self-insuring behavior consistent with maximization of expected utility.
Further support of this result can be found by comparing the
property value studies with surveys (Brookshire et al. 1982). In our
survey of homeowners located in SSZs in Los Angeles (Brookshire et al.
1980), when asked how much more they would pay to purchase the
same home outside of an SSZ, only 26 percent of respondents were
willing to pay anything more. However, the average of all responses
(including zero bids) was $5,920, very close to the average sale price
differential of $4,650 from the Los Angeles property value study. 14

Efficient prices should convey information to consumers. We have 
shown that the property value markets for both Los Angeles and San 
Francisco convey hedonic price differentials to consumers that corre¬ 
spond closely to expected earthquake damages for particular homes 
located in SSZs. Although the information provided by the SSZ pro¬ 
gram is by no means perfect, our results suggest that programs to 
provide consumers with hazard information may well be effective. 

14 Interestingly, when homeowners located outside of an SSZ were asked how much 
less expensive their house would have to be to get them to relocate in an SSZ, the 
average response was $28,250 (see Brookshire et al. 1980). This asymmetry between 
willingness to accept and willingness to pay measures of value has been demonstrated in 
a number of studies (see, e.g., Hovis, Coursey, and Schulze 1983). 



TABLE A1 

Variables Used in Analysis of Housing Market 


[Table body not recoverable from the source scan.]



Means and Standard Deviations (in Parentheses) for the Variables Used in the Hedonic Housing Equations 
Distance to beach             12.41      11.48
                              (7.69)     (7.48)

Number of observations        4,865      4,927      5,438      5,460
On Patents, R & D, and the Stock Market
Rate of Return


Ariel Pakes

Hebrew University and National Bureau of Economic Research


Empirical work on the causes and effects of inventive activity has had
difficulty in finding measures that can indicate when and where
changes in either inventive inputs or inventive output have
occurred. The recent computerization of the U.S. Patent Office's data
base may prove helpful in this context, but there is the problem that
a priori we do not know the relationships between patent
applications and economically meaningful measures of these inputs and
outputs. To help solve this problem, this paper investigates the
dynamic relationships among the number of successful patent
applications of firms, a measure of the firm's investment in inventive
activity (its R & D expenditures), and an indicator of its inventive
output (the stock market value of the firm).


To date our understanding of the role of invention and innovation in
economic processes has been severely hampered by a lack of empirical
evidence about its causes and its effects. In large part this reflects the
difficulty in finding (or constructing) meaningful measures of
invention. Early studies often used successful patent applications as
their output measure (Schmookler and Brownlee 1962; Griliches and
Schmookler 1963; Scherer 1965a, 1965b; Schmookler 1966). The
patent variable had the advantage of being a more direct consequence of
inventive activity than the other indicators of performance available
(examples used include profits, productivity, and sales of new
products) and the advantage that patent applications were, at least in
principle, available for an unusually long time period in an extremely
detailed breakdown (by both grantee and product class; see U.S.
Department of Commerce, Patent and Trademark Office, Office of
Technology and Assessment [1973-79]). There were, however, two
serious problems with the patent variable. First, though patent counts
were available in principle, they were inaccessible in practice. Second,
variation in the number of patents granted had no clear
interpretation. In particular, though it is clear that patent applications
should be granted only when a useful and technologically feasible
advance has been made (U.S. Department of Commerce, Patent and
Trademark Office 1978) and that the patentee expects some positive
benefit from the patent (since the process of application is costly in
itself), it is also the case that technological, institutional, and market
circumstances can cause patents to vary greatly in their economic value,
and that not all useful innovations are patented. (For discussions of the
usefulness of patent statistics see the exchange between Kuznets,
Sanders, and Schmookler in Nelson [1962]; Comanor and Scherer [1969];
and more recently Taylor and Silberston [1973].)

The recent computerization of the U.S. Patent Office's data base
has changed this situation. One can now obtain annual patent
applications in a variety of different breakdowns at reasonable cost
(see, e.g., Pakes and Griliches 1980). Thus the interpretative problem
now takes on renewed importance. That is, in order to use the patent
data to investigate hypotheses associated with the inducements to
engage in inventive activity, the relationship between inventive inputs
and inventive outputs, and the effects of those outputs, we require some
understanding of the empirical relationships between patent
applications and the investments of patentees, and between those
applications and an economically meaningful measure of the value of the
inventive outputs the patentees have produced.

This study provides an empirical characterization of the dynamic
relationships among the number of successful patent applications of
industrial firms, a measure of the firm's investment in inventive
activity (its R & D expenditures), and an indicator of its inventive
output (the stock market value of the firm). The use of stock market
values as the output indicator has one major advantage in this context.
As noted by Arrow (1962), the public-good characteristics of inventive
output make it extremely difficult to market. Returns to innovation
are earned mostly by embodying it in a tangible good or service that is
then sold or traded for other information that can be so embodied
(Wilson 1975; von Hippel 1982). There are therefore no direct
measures of the value of inventions, while indirect measures of current
benefits (such as profits or productivity) are likely to react to the
output of the firm's research laboratories only slowly and erratically
(see the review by Griliches [1979]). On the other hand, under
simplifying assumptions, changes in the stock market value of the firm
should reflect (possibly with error) changes in the expected
discounted present value of the firm's entire uncertain net cash flow
stream. Thus, if an event does occur that causes the market to
reevaluate the accumulated output of the firm's research laboratories,
its full effect on stock market values ought to be recorded
immediately. This full effect is, of course, the expected effect of the
event on future net cash flows and need not be equal to the effect that
actually materializes. The fact that we are measuring expectations
rather than realizations, however, does have its advantages. In
particular, expectations ought to determine research demand, so that the
use of stock market values should allow us to check whether the
interpretation we give to our parameter estimates is consistent with
the observed behavior of the research expenditure series.

To obtain the implications of such considerations this paper uses a
variant of Lucas and Prescott's (1971) investment model, together
with a patent indicator function, to suggest restrictions on the
stochastic process generating patents, R & D, and the stock market rate
of return on the firm's equity. These restrictions are embodied in a
testable form by approximating both the patent indicator function
and the function determining the value of the firm's R & D program.
The resulting econometric model is a variant of the index (Sargent
and Sims 1977) or dynamic-factor-analysis (Geweke 1977) models
that have recently been used to analyze macroeconomic data. The
restrictions imply the existence of a particularly simple recursive
system of equations that summarize and interpret the dynamic
relationships among patents, R & D, and the stock market rate of return.

This recursive form is estimated and tested on a micro data set that
contains information on 120 firms over an 8-year period. The
restrictions seem to be consistent with the observed behavior of the
data, and the paper focuses on the implications of the parameter
estimates, particularly those associated with the interpretation of
movements in the patent variable. These implications are investigated
both in the cross-section dimension (i.e., differences in patent
applications between different firms) and in the time-series dimension
(differences in the patent applications of a given firm over time).

Section I sets out the framework for the empirical analysis; Section
II provides estimates of the recursive form and the associated test
statistics. In Section III the implications of the parameter estimates
are considered in some detail. Brief concluding remarks close the
paper.
I. A Framework for the Empirical Analysis

The econometric model to be investigated consists of equations for
the stock market rate of return on the firm's equity, the R & D
expenditures of the firm, and the firm's patent applications. The
equations determining R & D expenditures and the stock market rate of
return can be motivated by the assumptions that management chooses an
R & D program to maximize the expected discounted value of the net
cash flows (sales minus current input costs) from its activities and that
the stock market measures this expectation subject to error. (Lucas
and Prescott [1971] provide a more detailed discussion of similar
assumptions.) The properties of the error term in the stock market
equation are derived from an arbitrage condition that ensures that
agents operating in the stock market cannot make excess returns
from a simple linear trading rule and the information contained in
the history of the R & D and stock market rate-of-return series.
Patent applications are taken to be an indicator of current and past
values of the inputs and the market value of the outputs of the firm's
R & D activity. This form of the patent equation reflects the lack of
prior information about the nature of the relationships between
patents and other variables, and a desire to obtain as general an
empirical characterization of those relations as possible. I begin by
outlining the derivation of the system of equations to be estimated,
focusing on the interpretation of the parameters and the restrictions
used to indicate whether this interpretation is consistent with the
observed behavior of the data. (More detailed derivations can be found
in Pakes [1981].)

Assume that management chooses a research program (a sequence
of random variables determining current and future R & D
expenditures conditional on the information available when those
expenditures must be made) to maximize the expected discounted value of
the net cash flows from the firm's activities, and that non-R & D
inputs can be adjusted costlessly at the beginning of each period to
maximize the profits attainable in that period. Management's
evaluation of a given program is found by substituting [...] net cash
flow functions, taking the expectation of [...] value of future net cash
flows plus [...] the current information set, Ω_t. [...] expenditures
of the firm [...] provides information [...] program can be written as

\[
V(\Omega_t, R_t) = H(R_t, R_{t-1}, R_{t-2}, \ldots, A_t) - R_t, \tag{1}
\]
where H(·) provides the expected discounted value of future net cash
flows and current profits conditional on current information, and A_t
summarizes the effect of other variables that are known to
management at the time input decisions are made, but that are not in the
econometrician's data set.

Clearly, for a program to be optimal it must maximize V(Ω_t, R_t) with
respect to R_t. That is, if R_t is optimal and V*(Ω_t) is management's
evaluation of the firm conditional on optimal behavior, then

\[
V^*(\Omega_t) = \max_{R_t} V(\Omega_t, R_t) = H(R_t, \ldots, A_t) - R_t. \tag{2}
\]

Note that equation (2) implies that an assumption on the functional
form of H(·) and on the stochastic process generating {A_t} will suffice
to determine the bivariate stochastic process generating the value of
the firm's R & D program and R & D itself. This implication is used in
the empirical analysis. 1

If the stock market provided an exact evaluation of the expected
discounted value of the firm's future net cash flows conditional on the
same information used by management, then the 1-period excess rate
of return on the firm's equity (capital gains plus dividends on $1.00
invested in the firm minus the interest rate) would equal the
percentage increase in the expected discounted value of these net cash
flows caused by the information that accumulates over the given period;
that is, it would equal q*_t, where (see n. 2)

\[
q_t^* = \frac{V_t^* - E(V_t^* \mid \Omega_{t-1})}{V_t^*}. \tag{3}
\]

We shall allow for a disturbance in the relationship between the
observed 1-period rate of return, say q_t, and q*_t, that is,

\[
q_t = q_t^* + \eta_{1,t}, \tag{4}
\]


1 Equation (2) follows from the Bellman condition for this problem, and the
possibility of using it to structure the empirical relationship between investment and
the value of the firm is noted by Lucas and Prescott (1971) (see also Sargent 1978, 1979).
This procedure does not provide direct evidence about the nature of the relationship
between R & D and net cash flows (a topic of considerable controversy; compare, e.g.,
Griliches [1979], in which a distributed lag of R & D is used to construct a knowledge
stock that enters into a production function for marketable goods and services, to
Nelson and Winter [1982] or Telser [1982], in which the distribution of outcomes from
a search process is affected by the quantity of resources invested in research). Our focus
here, however, is on the relationships among the value of the firm itself, R & D, and
patents; for this the Bellman condition suffices.

2 This is a discrete-time approximation to a continuous-time result. It assumes that
dividends are declared at the beginning of the period and ignores terms equal to the
within-period interest earned on dividends per share and the within-period interest on
capital gains per share (see Pakes 1981). A correction for this omission did not change
the empirical results.
but shall assume that this disturbance is uncorrelated with
information that is publicly available at the beginning of the period, in
particular, with the history of the R & D and rate-of-return series. This
arbitrage condition ensures that the process generating η_{1,t} does not
allow agents operating on the stock market to use publicly available
information and a simple linear trading rule to make excess returns
on that market, and therefore is consistent both with several previous
empirical studies (see Fama 1970; LeRoy and Porter 1981) and with
the observed behavior of our data (see below). 3 Since one can ensure
that cov(η_{1,t}, q*_t) = 0 by a normalization that affects only the relative
values of coefficients, and therefore does not affect the interpretation
of the parameter estimates, we shall also assume this condition in what
follows.

The third equation of the model is the indicator function for patent
applications. Note that, given current and past R & D, equation (2)
implies that the value of the firm's R & D program is determined
solely by A_t. To make patents (P_t) an error-ridden indicator of
current and past values of the inputs and the outputs from the firm's
R & D activity then, it suffices to specify that

\[
P_t = P(A_t, A_{t-1}, \ldots, G_t), \tag{5}
\]

where the disturbance process {G_t} sets the propensity to patent, that
is, determines the number of patents applied for given the history of
the inputs and the market value of the outputs from the firm's R & D
activity. The phrase "the propensity to patent" is taken from Scherer
(1965a, 1965b), who uses it to refer to differences in the number of
patents resulting from an innovation of a given quality. We will
assume the process generating that propensity, {G_t}, to be independent
of the process generating R & D and the value of the firm. These
assumptions provide a precise interpretation for the propensity to
patent that will be shown to lead to testable implications below. 4


3 Note that the presence of the error term, η_{1,t}, implies that there may be more
variance in stock market evaluations than can be justified by the variance in earnings
(which accords with the results of LeRoy and Porter [1981] and Shiller [1981]).

4 Note that eqq. (2) and (5) assume that there is only one sequence of random
variables, {A_t}, which, given current and past R, determines both the value of the R & D
program [V*(Ω_t)] and, apart from differences in the propensity to patent, patents per
se. It is possible to construct a richer model that identifies two factors, one affecting
patents only through the R & D expenditures it induces (say demand shocks) and one
having a direct effect on patents and an indirect effect via induced R & D demand (say
technological or supply shocks). For an interesting discussion of the implications of the
differences between demand and supply shocks see Schmookler (1966) and Rosenberg
(1974). Since, however, the empirical results indicated that to distinguish between
demand and supply shocks one requires more (and quite likely different) data than are
used here (see Pakes 1981), and since eq. (5) suffices for the reduced-form
interpretation of movements in the patent variable we are after, I shall concentrate on
the simpler model, which uses eq. (5), here.
Equations (2), (4), and (5) suggest easily interpretable restrictions
on the stochastic process generating \(q\), \(R\), and \(P\). To derive
an explicit form for those restrictions, I use a logarithmic
approximation to \(H(\cdot)\) in equation (1) and to \(P(\cdot)\) in
(5); assume that \(\{a_t = \log A_t,\ g_t = \log G_t,\ \eta_{1,t}\}\)
evolves as a covariance stationary stochastic process and use its
moving average (or Wold) representation (see Anderson 1971, sec. 7.6);
and solve equation (2) for \(r_t = \log R_t\), (4) for \(q_t\), and (5)
for \(p_t = \log P_t\). The stochastic process generating
\(\{q_t, r_t, p_t\}\) can then be written explicitly as

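\[
\begin{aligned}
q_t &= \epsilon_t + \eta_{1,t}, \\
r_t &= \sum_{\tau=0}^{\infty} c_{2,\tau}\,\epsilon_{t-\tau}, \\
p_t &= \sum_{\tau=0}^{\infty} c_{3,\tau}\,\epsilon_{t-\tau}
       + \sum_{\tau=0}^{\infty} b_{3,\tau}\,\eta_{3,t-\tau},
\end{aligned}
\tag{6}
\]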

where \(\{\epsilon_t\}\), \(\{\eta_{1,t}\}\), and \(\{\eta_{3,t}\}\) are
three mutually uncorrelated white noise processes (i.e., processes that
are serially uncorrelated with constant variance),
\(a_t = \sum_{\tau=0}^{\infty} a_\tau \epsilon_{t-\tau}\),
\(g_t = \sum_{\tau=0}^{\infty} b_{3,\tau}\eta_{3,t-\tau}\), and
\(b_{3,0} = 1\). Equation (6) decomposes the variance in each of the
observable deviates (in \(q_t\), \(r_t\), and \(p_t\)) into portions
resulting from current and past values of three innovations (i.e.,
unpredictable random variables). That is, \(\epsilon_t\) is the
innovation in \(a_t\), \(\eta_{3,t}\) is the innovation in \(g_t\), and,
due to our arbitrage condition, \(\eta_{1,t}\) is an innovation in
itself. The three innovations are uncorrelated with past values of all
variables and are mutually uncorrelated as a result of the assumed
independence of \(G_t\) from \(R_t\) and \(q_t\) and of the definition
of \(a_t\).⁵

For an intuitive understanding of the system in (6), note first that it
is realizations of \(\epsilon_t\) (the process determining \(a_t\), or
the value of the research program) that cause changes in \(r_t\). Now
suppose that an unexpected research-related event occurred during the
previous time period that increased the market value of the firm by 1
percent (i.e., \(\epsilon = 1\)). The returns on holding the firm’s
equity over that period will, as a result, be 1 percent above the market
rate of return. This same event will also cause changes in the firm’s
R & D program and in its patent applications. Current R & D expenditures
will go up by \(c_{2,0}\) percent above what would have been predicted
for them at \(t - 1\) (past \(\epsilon\)’s can be determined from past
\(r\)’s), while expected R & D expenditures \(\tau\) periods ahead will
go up by \(c_{2,\tau}\) percent. Similarly patent applications \(\tau\)
periods ahead will go up by \(c_{3,\tau}\) percent. A realization on
\(\eta_1\) equal to, say, \(X\) is noise in the sense that it never
(either currently or in the future) affects \(p\) or \(r\), while a
realization of \(\eta_3 = X\) will never affect either research
expenditures or the value of the firm and in this sense can be
interpreted as a change in the propensity to patent given the inputs and
the outputs of the firm’s R & D activities.

5 The system in (6) ignores any deterministic components in the
stochastic process generating \(\{q_t, r_t, p_t\}\). The empirical work
adds time dummy variables to all equations, and these should pick up any
deterministic components that exist.
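As a rough illustration of the impulse-response reading of system (6), the sketch below traces the three observables after a one-time unit shock. It is not the paper's estimation code. The function name `system6` and the lag coefficients in `C2`, `C3`, and `B3` are hypothetical placeholders chosen for the example (only the requirement that the first element of `B3` equals 1 comes from the text), and the infinite sums are truncated at four lags with pre-sample innovations set to zero.

```python
def system6(eps, eta1, eta3, c2, c3, b3):
    """Build (q, r, p) paths from innovation paths under system (6).

    eps, eta1, eta3: lists of innovations for t = 0, 1, ..., n-1.
    c2, c3, b3: finite lists of lag coefficients (a truncation of the
    infinite moving averages; an assumption made for illustration).
    Innovations dated before t = 0 are treated as zero.
    """
    n = len(eps)

    def ma(coeffs, x, t):
        # Moving average: sum_j coeffs[j] * x[t - j], skipping pre-sample terms.
        return sum(c * x[t - j] for j, c in enumerate(coeffs) if t - j >= 0)

    q = [eps[t] + eta1[t] for t in range(n)]            # q_t = eps_t + eta_1t
    r = [ma(c2, eps, t) for t in range(n)]              # r_t driven by eps only
    p = [ma(c3, eps, t) + ma(b3, eta3, t) for t in range(n)]
    return q, r, p


# Hypothetical lag coefficients, for illustration only.
C2 = [2.6, 2.0, 1.5, 1.0]
C3 = [1.56, 1.7, 1.2, 0.8]
B3 = [1.0, 0.5, 0.25, 0.1]   # b_{3,0} = 1 as in the text

unit = [1.0, 0.0, 0.0, 0.0]
zero = [0.0, 0.0, 0.0, 0.0]

# A unit research-related shock moves q once, and r and p with a lag pattern.
q, r, p = system6(unit, zero, zero, C2, C3, B3)
print("eps shock :", q, r, p)

# A unit eta_1 shock is pure stock-market noise: only q moves.
print("eta1 shock:", system6(zero, unit, zero, C2, C3, B3))

# A unit eta_3 shock is a propensity-to-patent shock: only p moves.
print("eta3 shock:", system6(zero, zero, unit, C2, C3, B3))
```

Feeding in one shock at a time makes the exclusion restrictions visible: the response of r to an ε shock reproduces `C2`, while η₁ and η₃ shocks leave r untouched.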

II. Test Statistics and Parameter Estimates 

Formally the econometric model given by equation (6) is a restricted
version of a dynamic-factor-analysis (Geweke 1977) or an unobservable
index (Sargent and Sims 1977) model. The name is a result of the fact
that in (6) there is a single stochastic process, built up from the
\(\epsilon_t\), that accounts for all the observed correlations between
current and past values of the components of \(y_t = (q_t, r_t, p_t)\).
This provides the empirical interpretation to realizations of
\(\eta_3\) and \(\eta_1\): \(\eta_3\) stems from differences in
patenting that are never associated with differences in the value of the
firm or in the firm’s R & D program; and \(\eta_1\) stems from movements
in the stock market value of the firm that are never associated with its
R & D program or its patents. The model in equation (6) is more
restricted than the general index model. In particular, it constrains
\(q_t\) to be a function of only current values of \(\epsilon\) and
\(\eta_1\). Since the history of \(\epsilon\) and \(\eta_1\) can be
predicted from the history of \(y\), the implication this constraint is
testing is that realizations of \(q_t\) cannot be predicted from the
history of the variables in our data set. In addition the system in (6)
does not allow a separate stochastic process that affects \(r\) but does
not affect \(p\) or \(q\) (all the variance in \(r\) is accounted for by
current and past values of \(\epsilon\), or there is no measurement
error in \(r\)). This assumption was maintained because the empirical
results indicated that there was no need to allow for such a measurement
error.⁶

The restrictions embodied in (6) allow for relatively straightforward
estimation and testing procedures. This results from the fact that the
system in (6) has a recursive form, in which all restrictions are
exclusion restrictions, and which, by its recursive nature, permits
equation-by-equation estimation techniques. This recursive form has
\(q_t\) as a function of the history of \(y_t\), \(r_t\) as a function
of \(q_t\) and the history of \(y_t\), and \(p_t\) as a function of
\(q_t\), \(r_t\), and the history of \(y_t\). We now provide and
estimate each of the equations of this recursive form.


6 See Pakes (1981). This finding is comforting in a slightly different
context, since it indicates that once one moves away from measuring the
effects of R & D via its impact on indirect measures of current
benefits, there is less need to worry about measurement error in the
R & D series (see Griliches [1979] for the importance of measurement
error in studies designed to measure the contribution of R & D to
productivity).
The data used here contain the successful patent applications, the
R & D expenditures, and the annual rates of return on the stocks of
120 firms over an 8-year period (1968-75). The sample of firms and
the method of constructing the patent and R & D variables are discussed
in Pakes and Griliches (1984). The observations on the stock
market rates of return were taken from the 1975 Master File of the
University of Chicago’s Center for Research in Security Prices
(CRSP). I use the rates of return in the year before the R & D
expenditures and patent applications were made. This is a result of the
assumption that decisions on \(r\) and \(p\) are made at the beginning
of the year; as we shall see below, this assumption is supported by the
data.

The leading equation of the recursive form has \(q_t\) as a function of
lagged values of itself, \(r\), and \(p\). The model predicts that
neither component of \(q\) (\(\epsilon\) or \(\eta_1\)) can be
predicted by a linear combination of these variables, or that agents
cannot make excess returns on the stock market from a linear trading
rule based on the history of \(y_t\). Table 1 presents test statistics
for this hypothesis. Column 1 shows that it is reasonable to assume that
\(q_t\) cannot be predicted from past values of itself, column 2 that it
cannot be predicted from past values of \(r\) or \(p\), and column 3
that it cannot be predicted from past values of itself, \(r\), or \(p\).
Thus rates of return do seem to represent unpredictable movements in the
value of the firm, or at least movements that cannot be predicted with
the variables in our data set.

To obtain the recursive form of the \(r_t\) equation, first note that
\(\epsilon_t\) can be written as

\[
\epsilon_t = \theta q_t + v_t, \tag{7}
\]

where \(\theta = \sigma_\epsilon^2/\sigma_q^2\), that is, \(\theta\) is
the signal-to-total-variance ratio in \(q_t\), and
\(v_t = (1 - \theta)\epsilon_t - \theta\eta_{1,t}\). It follows that
\(v_t\) is uncorrelated with \(q_t\) and with past values of all
variables. Next, assuming that there is an autoregressive
representation for the \(r_t\) equation, we obtain it as
\(r_t = c_{2,0}\epsilon_t + d_2(L)r_{t-1}\), where, here and in the
discussion below, a function of \(L\) represents a polynomial in the lag
operator and \(c_2(L) = c_{2,0}[1 - d_2(L)]^{-1}\).⁷ Substituting (7)
into the autoregressive form of the \(r_t\) equation, we obtain

\[
r_t = c_{2,0}\theta q_t + d_2(L)r_{t-1} + c_{2,0}v_t. \tag{8}
\]

Note that the variance of the disturbance in equation (8) is
\(\sigma_q^2 c_{2,0}^2(1 - \theta)\theta\), so that (together with the
first coefficient and \(\sigma_q^2\)) it can be used to identify
\(\theta\) and therefore \(c_{2,0}\).
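The two properties of \(v_t\) used above can be checked in a few lines. The derivation below is a sketch under the assumptions already stated in the text (\(\epsilon_t\) and \(\eta_{1,t}\) uncorrelated, \(q_t = \epsilon_t + \eta_{1,t}\)); the symbol \(\sigma_1^2\) for the variance of \(\eta_{1,t}\) is introduced here only for convenience and is not the paper's notation.

```latex
% Since \sigma_q^2 = \sigma_\epsilon^2 + \sigma_1^2 and
% \theta = \sigma_\epsilon^2 / \sigma_q^2, we have
% \sigma_\epsilon^2 = \theta\sigma_q^2 and \sigma_1^2 = (1-\theta)\sigma_q^2.
\begin{align*}
\operatorname{cov}(v_t, q_t)
  &= (1-\theta)\sigma_\epsilon^2 - \theta\sigma_1^2
   = (1-\theta)\theta\sigma_q^2 - \theta(1-\theta)\sigma_q^2 = 0, \\
\operatorname{var}(v_t)
  &= (1-\theta)^2\sigma_\epsilon^2 + \theta^2\sigma_1^2
   = \theta(1-\theta)\sigma_q^2\,[(1-\theta) + \theta]
   = \theta(1-\theta)\sigma_q^2, \\
\operatorname{var}(c_{2,0}v_t)
  &= \sigma_q^2\, c_{2,0}^2\,(1-\theta)\theta .
\end{align*}
```

The last line is the disturbance variance quoted for equation (8). Given \(\sigma_q^2\), the coefficient on \(q_t\) (which is \(c_{2,0}\theta\)) and this variance are two equations in the two unknowns \(\theta\) and \(c_{2,0}\), which is why both are identified.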

Equation (8) is reminiscent of Grunfeld’s (1960) investment equation.
Grunfeld used stock market evaluations to proxy for the effect
of unobservable expectations on investment (Lucas and Prescott
[1971] provide a more rigorous justification for this procedure). In
equation (8) revisions in stock market evaluations (i.e., \(q_t\)) are
used to proxy for the effect of factors that caused revisions in the
expected discounted value of the firm’s R & D program. This allows us to
identify the time pattern of the relationship among changes in the
market value of the firm’s R & D program, patents, and R & D itself.
Note also that since \(v_t\) is uncorrelated with \(q_t\) and with past
values of all variables, equation (8) implies that in a regression of
\(r_t\) on \(q_t\) and lagged values of all variables (which, recall, is
the second equation of the recursive form), all the coefficients but
those on current \(q\) and lagged \(r\) should be close to zero.

The recursive form of the \(p_t\) equation is obtained by multiplying
the last equation in the system in (6) through by
\(b_3(L)^{-1} = 1 - d_3(L)\) and making the substitution
\(\epsilon_t = c_2(L)^{-1}r_t\). This implies that

\[
p_t = \gamma(L)r_t + d_3(L)p_{t-1} + \eta_{3,t}, \tag{9}
\]

where \(\gamma(L) = c_3(L)c_2(L)^{-1}[1 - d_3(L)]\). Since
\(\eta_{3,t}\) is uncorrelated with current \(q\) and \(r\) and past
values of all variables, the model implies that in a regression of
\(p_t\) on \(q_t\), \(r_t\), and lagged values of all variables (which
is the last equation of the recursive form), all the \(q\) coefficients
should be close to zero.

Table 2 presents the results. The unrestricted autoregressive forms 
of these equations (the form that has r and p as a function of only 
lagged values of all variables) have been presented for comparison, 
while the relevant test statistics are presented at the bottom of the 


7 That is, \(d_i(L) = \sum_\tau d_{i,\tau}L^\tau\), where
\(L^\tau x_t = x_{t-\tau}\). I assume that the roots of the polynomial
equations associated with \(c_2(L)\) and \(b_3(L)\) all lie outside the
unit circle. This ensures the existence of a convergent autoregressive
representation for the \(r_t\) and \(p_t\) equations (see Anderson 1971,
sec. 5.7).
table. Beginning with the R & D equation (col. 1) one finds two rather
striking implications of the estimates. First, the events leading the
market to reevaluate the firm are indeed highly and positively
correlated with the events leading the firm to change its R & D policy
from what would have been predicted given the firm’s observable
history (i.e., the history of \(y_t\)). There is really no doubt on this
point, as the coefficient of \(q_t\) is large and estimated with great
precision. Equally striking is the fact that we can be quite sure that
each of the coefficients of the lagged \(p\) variables in this equation
is very close to zero (once again all of the estimates are near zero and
their standard errors are small; see also test \(T_2\) of this col.).
Thus once we account for the influence of past \(r\) and current and
past \(q\), the additional information in movements in past \(p\) is
information that never affects R & D expenditures. This is confirmation
of our interpretation of the \(\eta_{3,t}\) process as differences in
the propensity to patent for a given history of the firm’s R & D
program, since changes in it do not affect \(r\).

The only implication of the model, then, that is not strongly supported
by the estimates of column 1 is the zero restriction on the
lagged q coefficients. The relevant test statistic here is 7^ of column 2, 
which is significant at the 5 percent but not the 1 percent level.
Additional results, which will not be discussed here, indicated that we
observe marginally significant lagged \(q\) coefficients because the
assumption that the process generating \(r_t\) has a low-order
autoregressive representation is questionable. Since this is a technical
problem and since correcting for it did not change any of the basic
implications of the parameter estimates, we shall ignore it below and
accept the column 3 estimates for the \(r_t\) equation.⁸

The parameter estimates from the patent equation make it clear
that current and past changes in R & D (past changes only in col. 5)
have a significant effect on changes in current patent applications
(test \(T_1\)). Though this was perhaps to be expected (see Pakes and
Griliches 1980), what is more surprising is that once the effect of
R & D expenditures on patent applications is taken care of, other
factors leading to a change in the market’s evaluation of the firm are
not correlated with patent applications (test \(T_3\)). In particular,
all the \(q\) coefficients in the \(p\) equation are near zero, and this
leads us to accept the interpretation of the error in the regression of
\(p_t\) on the \(r_{t-\tau}\) as differences in the propensity to
patent, given the market value of the output of the firm’s current and
past research expenditures.

An omnibus test of the model’s restrictions can be obtained by
comparing the likelihoods of the restricted and the unrestricted
recursive system of equations. The observed value of the
\(\chi^2_{25}/25\) likelihood ratio test statistic for the null
hypothesis embodied in the model’s assumptions was 1.15, which is not
too different from the expected value of a \(\chi^2_{25}/25\) deviate
(0.97) and certainly not surprising (the 5 percent critical test value
is 1.51). Since the assumptions of the model seem to be consistent with
the observed behavior of the data,⁹ we now go on to explore the
implications of the parameter estimates in greater detail.

8 More details on these points can be found in Pakes (1981).
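The reference values quoted for the likelihood ratio statistic can be reproduced approximately with the standard library alone. The sketch below is an illustration, not the author's computation: it assumes the statistic is a chi-square with 25 degrees of freedom divided by 25, and it uses the Wilson-Hilferty cube-root approximation rather than an exact quantile routine. The helper name `chi2_quantile_wh` is made up for this example.

```python
from statistics import NormalDist


def chi2_quantile_wh(prob, df):
    """Approximate chi-square quantile (Wilson-Hilferty cube-root formula)."""
    z = NormalDist().inv_cdf(prob)   # standard normal quantile
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


df = 25                                              # assumed degrees of freedom
critical_per_df = chi2_quantile_wh(0.95, df) / df    # 5 percent critical value
median_per_df = chi2_quantile_wh(0.50, df) / df      # central reference value

print(round(critical_per_df, 2), round(median_per_df, 2))
```

Under these assumptions the 5 percent critical value comes out near 1.51, so the observed 1.15 lies well inside the acceptance region. The 0.97 reference value is close to the median of the scaled statistic (its mean is exactly 1), which suggests that the figure reported in the text is a median-type central value.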

III. Some Implications of the Parameter 
Estimates 

I begin with the implications of the estimates for the interpretation of
movements in \(q\) and \(r\). Noting that \(\sigma_q^2 = 0.10\) and
using the parameters of the R & D equation, we find a \(\theta\)
(\(\sigma_\epsilon^2/\sigma_q^2\)) of 0.05. That is, about 5
percent of the within-period variance in the rate of return is caused
by events that also cause changes in both R & D expenditures and
patent applications.¹⁰ A \(\theta\) of 0.05 implies that \(c_{2,0}\)
(\(= \partial r_t/\partial\epsilon_t\)) \(= 2.60\).
This implies that a 1 percent increase in R & D expenditures above
what would have been predicted from past information is associated
with events that have caused an increase in the value of the firm of
0.39 percent. Evaluating derivatives at the means of all variables, we
find that a $100 unexpected increase in R & D is associated with
research and patent-related events that have increased the value of the
firm by $1,870.¹¹ Recall that the results implied that there was no
need to allow for measurement error in R & D (see Sec. II), so that all
unpredictable changes in R & D have this interpretation. The
unexpected increase in patents is \(c_{3,0}\epsilon_t + \eta_{3,t}\),
where, from the estimates, \(c_{3,0} = 1.56\). Thus, events that lead to
a unit increase in \(\epsilon\) result in a
1.56 percent increase in successful patent applications. Much of the
variance in the unexpected change in the patent variable (about 94

9 To ensure the robustness of this conclusion with respect to the
statistical assumptions, the tests of the recursive form were also run,
using first differences (instead of levels) of the \(r\) and \(p\)
series, using weighted \(r\) and \(p\) series where the weight for a
given firm was the square root of the mean R & D expenditures of that
firm over the sample period, and allowing the coefficients of the
recursive form to differ in the different years of the sample. None of
the resulting test statistics indicated rejection of the model’s
assumptions. There was, however, an indication that some of the
coefficients in the recursive form were not stable over time, though the
economic implications of the intertemporal differences in these
coefficients were minor.

10 The firms in our sample are all rather large (the average value of their common 
shares is $1,514 million) and diversified, and they do a great deal of research. 

11 The means reported here are sample means; i.e., they are calculated over all
observations (N firms and T years) and thus require the use of price deflators. The CPI 
was used to deflate stock market values, and the R & D deflator discussed in Pakes and 
Griliches (1984) was used for R & D expenditures. The base year for these deflators is 
1972, so all dollar figures in the text are in 1972 dollars. 
percent) is noise, so that we find that a 1 percent increase in patents
will, again on average, reflect only a 0.044 percent increase in the
market value of the firm; alternatively, one additional patent indicates
that events have occurred that increase the firm’s market value by
$810,000. The estimates imply, then, that although unexpected
changes in patents are a very noisy indicator of unexpected changes
in the market value of the firm’s R & D program, on average, an
increase of one patent is associated with large changes in market
value.
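The R & D figures in this paragraph follow from the reported parameters by simple arithmetic. The sketch below reworks them under my reading of equations (7) and (8): the variable names are illustrative, and treating 0.39 as the reciprocal of \(c_{2,0}\) is an interpretation of the text rather than a formula stated in it.

```python
theta = 0.05        # signal-to-total-variance ratio in q (from the text)
sigma_q_sq = 0.10   # variance of the rate of return q (from the text)
c20 = 2.60          # contemporaneous response of r to epsilon (from the text)

# Variance of the research-related innovation: sigma_eps^2 = theta * sigma_q^2.
sigma_eps_sq = theta * sigma_q_sq

# Coefficient on q_t implied by equation (8): c_{2,0} * theta.
q_coefficient = c20 * theta

# With no measurement error in r, an unexpected 1 percent rise in R & D
# corresponds to epsilon = 1 / c_{2,0} percent of market value.
value_change_per_rd_pct = 1.0 / c20

# Disturbance variance of equation (8): sigma_q^2 * c_{2,0}^2 * theta * (1 - theta).
disturbance_var = sigma_q_sq * c20 ** 2 * theta * (1.0 - theta)

print(sigma_eps_sq, q_coefficient, value_change_per_rd_pct, disturbance_var)
```

The reciprocal comes out at roughly 0.385, slightly below the 0.39 in the text; the small gap is presumably due to rounding in the published value of \(c_{2,0}\).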

Figure 1 presents the estimates of the distributed lags from
\(\epsilon\) to \(r\) (labeled \(c_2[\tau]\)) and from \(\epsilon\) to
\(p\) (\(c_3[\tau]\)); while figure 2 presents the distributed lags from
\(r\) to \(p\) (\(\gamma^*[\tau]\), where
\(\gamma^*[\tau] = c_3[\tau]c_2[\tau]^{-1}\)) and from \(\eta_3\) to
\(p\). Figure 1 makes it clear that the events that change the
market value of a firm’s research program have a persistent effect on
both patents and R & D expenditures. As a result interfirm differences
in R & D expenditures are quite stable over time, and if we are
seeking their causes we should look for factors in the firm’s
environment whose effects are likely to persist. On the other hand, the
small changes that do occur in the firm’s R & D expenditures are almost
entirely determined by recent events. Thus events that occurred over
3 years earlier will have essentially the same effect on \(r_t\) as on
\(r_{t-1}\) and cannot cause differences between them. The estimate of
\(c_3(\tau)\) is similar to that of \(c_2(\tau)\), except that the
effect of the \(\epsilon\) on \(p\) tends to increase before declining,
giving the impression that \(p\) reacts to the \(\epsilon\) a little
more slowly than \(r\) does. Thus, moving to figure 2, we see that
patent applications follow the factors determining the productivity of
current R & D expenditures (and hence R & D demand) quite closely.

The sum of the coefficients in the distributed lag from \(r\) to \(p\)
is 1.18, implying that the events leading to a 1 percent increase in
R & D expenditures will, eventually, lead to a 1.18 percent increase in
patented innovations. About 50 percent of these patents will be applied
for in the same year as the R & D expenditures are incurred, while 70
percent will be applied for within 3 years. In fact, if from
\(c_2(\tau)\) one gets the impression that events that cause unexpected
changes in the value of a firm’s R & D program start a chain reaction
leading to more R & D expenditures far into the future, then
\(\gamma^*(\tau)\) seems to be describing a situation where firms patent
around the links of this chain almost as quickly as they are completed.
There is also a long, slim tail of the distributed lag from \(r\) to
\(p\), which probably represents the effect of the basic research done
in the past on current patented innovations.¹²


12 The reader is cautioned not to interpret the distributed lag from R & D to patents as
representing a production-type relationship between past R & D and patentable
output. The estimates presented here do not distinguish the direct effect of R & D on
patents from the effect of changes in the value of the firm’s R & D program (in \(a_t\)) on
R & D and patents (this is the dynamic analogue of the classical simultaneous equations
problem discussed in Marschak and Andrews [1944]). The estimate of \(\gamma^*(L)\) is similar
The estimates of \(g(\tau)\) indicate that interfirm differences in the
propensity to patent are not as stable over time as one might have
expected. Thus, recalling that \(g_t\) is the propensity to patent
(\(g_t = \sum_{\tau=0}^{\infty} b_{3,\tau}\eta_{3,t-\tau}\)), we find
that the correlation of \(g_t\) and \(g_{t-\tau}\) is only about .75 for
\(\tau = 1\), going down to around .6 for \(\tau = 2\), 3, and 4, and
decaying at a fairly constant rate of .9 thereafter.
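Correlations of this kind follow mechanically from the moving-average weights. The sketch below shows the general formula for a moving average driven by white noise; the function name and the example weights are hypothetical and are not the paper's estimates of \(b_{3,\tau}\).

```python
def ma_autocorrelation(b, lag):
    """corr(g_t, g_{t-lag}) for g_t = sum_j b[j] * eta_{t-j}, eta white noise.

    The innovation variance cancels, so only the weights matter:
    corr = sum_j b[j] * b[j + lag] / sum_j b[j] ** 2.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    numerator = sum(b[j] * b[j + lag] for j in range(len(b) - lag))
    denominator = sum(x * x for x in b)
    return numerator / denominator


# Hypothetical weights: a short moving average and a slowly decaying one.
short_weights = [1.0, 0.5]
geometric_weights = [0.9 ** j for j in range(200)]

print(ma_autocorrelation(short_weights, 1))       # 0.5 / 1.25
print(ma_autocorrelation(geometric_weights, 1))   # close to 0.9
```

With geometrically declining weights the correlation falls by a constant factor per lag, which is the pattern the text describes for lags beyond four years.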

A question of general interest is: How closely related to differences
in the outputs and the inputs of the firm’s inventive activity are
measures based on the recently computerized U.S. Patent Office’s data
base likely to be? The data suggest that some differences in patent
applications are close approximations to differences in these
variables, while others are not.

First, consider constructing a cross section of patent applications by
firm in order to study the causes of interfirm differences in inventive
output (or their effects). The estimates indicate that 76 percent of the
interfirm variance in patents is caused by the \(\epsilon_t\), that is,
by research-related events that cause changes in the market value of the
firm, while the remainder is noise (not related to either the firm’s
research program or its value). If one were to ask what proportion of
the variance in \(p_t\) is caused by the events determining current
research demand, the answer would be a little less, but not much. To see
this, consider the projection of \(p_t\) onto \(r_t\), that is,
\(p_t = \phi r_t + g'_t\), where \(\mathrm{cov}(g'_t, r_t) = 0\).¹³
Appropriate calculations indicate that \(\phi = 1.12\), while
\(\mathrm{var}(\phi r_t)/\mathrm{var}(p_t) = 0.74\). A 1 percent
difference in \(R_t\) will, therefore, be associated with a 1.12 percent
difference in patent applications, while about 74 percent of the
interfirm variance in \(p_t\) can be attributed to interfirm variance in
\(r_t\). Inverting these calculations one finds that, on average, a 1
percent difference in current patent applications is associated with
factors that have led to a 0.66 percent difference in \(R_t\);¹⁴ this
implies that (on evaluating derivatives at the sample means of all
variables), a difference of one patent is associated with events that,
on average, are associated with a $304,000 difference in
current R & D activity.
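The "inverted" coefficient can be recovered from the two numbers reported for the forward projection. The sketch below uses only standard least-squares identities; the variable names are illustrative.

```python
phi = 1.12               # slope in the projection of p_t on r_t (from the text)
explained_share = 0.74   # var(phi * r_t) / var(p_t) (from the text)

# var(phi * r) / var(p) = phi^2 * var(r) / var(p), so the variance ratio is:
var_r_over_var_p = explained_share / phi ** 2

# Slope of the reverse projection of r_t on p_t:
# cov(r, p) / var(p) = phi * var(r) / var(p).
phi_prime = phi * var_r_over_var_p

print(round(var_r_over_var_p, 3), round(phi_prime, 2))
```

The reverse slope comes out at about 0.66, matching the figure in the text, and the product of the two slopes equals the explained share of 0.74, as it must for a pair of simple projections.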

Unfortunately, intrafirm differences in patent applications do not
seem to be as good an indicator of intrafirm differences in inventive
output as interfirm differences. The proportion of the variance in
\(p_t - p_{t-1}\) caused by the \(\epsilon\) is about 8 percent, with 45
percent of this 8 percent caused by research-related and patent-related
events that changed the market value of the firm in the given period (by
\(\epsilon_t\)).


to what Zvi Griliches and I, in joint preliminary work, suggest as a likely form for this
lag structure (Pakes and Griliches 1980).

13 Here \(\phi = \sum_\tau c_{rr}(\tau)\gamma^*(\tau)/c_{rr}(0)\), where
\(c_{rr}(\tau) = \mathrm{cov}(r_t, r_{t-\tau})\) and \(\gamma^*(\tau)\)
is the \(\tau\)th lag coefficient in the distributed lag from \(r\) to
\(p\).

14 That is, \(r_t = \phi' p_t + g''_t\), where \(\mathrm{cov}(g''_t, p_t) = 0\), and \(\phi' = 0.66\).
These ratios do, however, increase significantly when one takes
intrafirm differences in patent applications that are farther apart.
The proportion of the variance in \(p_t - p_{t-5}\) caused by the
\(\epsilon\) is 15 percent, with over 75 percent caused by events that
occurred during the 5-year period. For 10-year differences the figures
move to over 20 and 85 percent, respectively. Thus if one were to use
intrafirm differences in patent applications to study the effect of
changes in a firm’s inventive output on, say, its investment policy or
its share of a given market, then one ought, probably, to stick to
longer-term changes in all variables.

IV. Concluding Remarks 

Empirical work on the causes and the effects of inventive activity has
had difficulty in finding variables that can indicate when and where
changes have occurred either in the inducements to invest in inventive
activity or in inventive output. The recent computerization of the
U.S. Patent Office’s data base may provide some help in this context,
but there is the problem that a priori one does not know the
relationship between successful patent applications and economically
meaningful measures of these inputs and outputs. To provide a partial
answer to this question, this paper investigated the relationship
between successful patent applications, a measure of the inputs into the
inventive process (R & D expenditures), and a variable that provides a
measure of, among other diverse factors, the value of the output from
this process (movements in the stock market value of the firm’s
equity). The assumptions that management chooses an R & D program to
maximize the expected discounted value of the net cash flows
from the firm’s activities, that the stock market measures this
expectation subject to error, and that patents are an error-ridden
measure of current and past values of the inputs to and the outputs from
the firm’s R & D activity were used to suggest a testable interpretation
of the dynamic relationships among the three observable variables. This
interpretation seemed consistent with the observed behavior of the
data, and the qualitative nature of the empirical results can be
summarized quite succinctly.

First, it is clear that the events that lead the market to reevaluate the
firm are indeed significantly correlated with unpredictable changes in
both the R & D and the patents of the firm. Moreover, the estimates
imply that, on average, unexpected changes in patents and in R & D
are associated with quite large changes in the market value of the
firm. Nevertheless, there is a large variance to the increases in the
value of the firm that are associated with a given increase in its
patents. This may reflect an extremely dispersed distribution of the
values of patented ideas. Further, most of the variance in the stock
market rate of return has little to do with the firm’s inventive
endeavors, at least as measured by its R & D input and its patent
output. However, once appropriate disturbances are allowed for, the
observations on the stock market rate of return do seem to enable us to
separate out the time pattern of the impacts of events that cause
changes in the value of a firm’s R & D program (movements in the
stock market rate of return do seem to be a result of unpredictable
events, and stock market evaluations should not depend on the long
and erratic lag structure between invention and the current benefits
derived from it).

The events that do cause the market to reevaluate the firm’s inventive
endeavors have long-lasting effects on both the patents and
R & D expenditures of the firm. On the other hand, the effects of the
factors that cause differences in the propensity to patent are much
more transient. These timing patterns have several implications. The
large differences in the patent applications of different firms are
mostly associated with differences in the market’s evaluations of
differences in the firms’ inventive output. However, the smaller
differences that occur in the patent applications of a given firm over
time are due largely to differences in the propensity to patent. Of
course, some information is still in the time-series dimension. If we
were to observe, for example, a sudden burst in the patent applications
of a given firm, we could be quite sure that events have occurred to
cause a large change in the market value of its R & D program; but
smaller changes in the patent applications of a given firm are not
likely to be very informative. This latter statement must be modified
somewhat when we consider long-term differences in the patents of a
given firm (say differences over a 5- or 10-year interval), as a larger
portion of their variance is caused by events that lead the market to
reevaluate the firm’s inventive output during these periods.

The timing of the impact of the events that cause unexpected
changes in the market value of a firm’s inventive activity on patents is
very close to the timing of their impact on R & D. In fact one gets the
impression from the estimates that an event that causes a 1 percent
change in the market value of a firm’s inventive activity starts a chain
reaction leading to more R & D expenditures far into the future, with
the firm patenting around the links of this chain almost as soon as
they are completed. These timing patterns imply that current patent
applications are highly correlated with current R & D demand. In this
context it should be noted that R & D itself is generally not available
by product field, for smaller business concerns or, before 1972, for
most large business enterprises. The availability of the patent data
together with some of the qualitative results presented here should,
therefore, allow us to study the causes and effects of R & D activity in
a much wider variety of situations, and in more detail, than has been
possible to date. To use patent and R & D data jointly to distinguish
between the different kinds of events that can cause changes in
inventive activity (say demand shocks vs. technological or supply
shocks), and then isolate their impacts on behavior and performance,
seems to require a larger, and perhaps more detailed, model than the one
used here.
The Social Costs of Monopoly and
Regulation: Posner Reconsidered

Franklin M. Fisher

Massachusetts Institute of Technology


The traditional analysis of the costs of monopoly concentrates on the
deadweight loss involved, monopoly rents being considered merely a
transfer to the monopolist from the consumer surplus that would
exist under competition. Some years ago, that analysis was challenged
by Posner (1975), who presented an ingenious argument that monopoly
rents in fact measure the resources lost to society through
rent-seeking activities and thus should be counted in the costs of
monopoly. That argument has recently been used by staff members of the
Federal Trade Commission (Long et al. 1982, chap. 3, esp. pp. 77, 97,
104; see also Tollison, Higgins, and Shugart 1983, pp. 23-44) in an
attempt to estimate the benefits potentially flowing from the use of
the FTC’s line-of-business program in antitrust enforcement.

Unfortunately, Posner’s argument, while a useful corrective to the
traditional proposition that deadweight loss is all that matters, is not
correct as a general analysis of the costs of monopoly, and conclusions
based on it about the benefits of marginal changes in antitrust
activities are likely to be particularly fallacious.

Posner’s assumptions and conclusion are as follows:

1. Obtaining a monopoly is itself a competitive activity, so
that, at the margin, the cost of obtaining a monopoly is exactly
equal to the expected profit of being a monopolist. An
important corollary of this assumption is that there are no
intramarginal monopolies -- no cases, that is, where the expected
profits of monopoly exceed the total supply price of the inputs used to
obtain the monopoly. If there were such an excess, competition in the
activity of obtaining the monopoly would induce the competing firms (or
new entrants) to hire additional inputs in an effort to engross the
additional monopoly profits.

I am indebted to Richard Posner and George Stigler for comments, but I
retain responsibility for error.

2. The long-run supply of all inputs used in obtaining 
monopolies is perfectly elastic. Hence, the total supply price 
of these inputs includes no rents. 

3. The costs incurred in obtaining a monopoly have no 
socially valuable by-products. 

The first two assumptions assure that all expected monopoly rents 
are transformed into social costs, and the third that these costs 
do not generate any social benefits. [Posner 1975, p. 809; 
emphasis added] 

The problem, I believe, lies with the first assumption and the fact 
that the last statement therein does not follow and is unlikely to be 
true. To begin to see why this is the case, consider for a moment the 
standard result of competitive theory that profits are reduced to zero 
in equilibrium. That result follows even when firms are differentially 
situated, because such differences are defined as rents. Thus, a manufacturing
firm particularly well located ends up earning no equilibrium
profits in its manufacturing activity despite its favorable location,
because we impute to the location the money that flows from that
advantage and treat it as a cost when considering manufacturing. But 
rents are what Posner’s analysis is all about. If firms are differentially 
situated in terms of the ease with which monopoly can be obtained, 
then they will earn rents that will not represent social costs. 

Will firms be differentially situated? Posner plainly means to as¬ 
sume that they are not. His assumptions, even if true, are insufficient 
to guarantee this, however, not only because constant costs may in¬ 
volve imputed rents but also because (contrary to Posner’s assertion, 
p. 810) the assumption that inputs are available at constant prices 
does not imply that costs are constant. That conclusion also requires 
that the production function—here the production of monopolies— 
exhibit constant returns to scale, and Posner fails to assume this. 

As a matter of fact, such an assumption would not be a plausible 
one in most contexts. Consideration of what is involved requires a 
closer examination of what is meant by the assumption that “obtain¬ 
ing a monopoly is itself a competitive activity” so that there is ease of 
entry into that activity and profits are competed away. 

There are two possible ways to interpret Posner’s “production of
monopolies.” The “competitive activity” involved is either that of
obtaining monopolies generally or that of obtaining a particular
monopoly. It is useful to examine both versions.

Suppose first that the activity involved is that of obtaining monopo¬ 
lies generally. Here the assumption that there is easy entry is plainly 
plausible; one can readily imagine potential monopolists searching 
for an appropriate area to monopolize. On the other hand, the as¬ 
sumption of constant costs—or of no rents—is not easy to maintain in 
that context. The supply of potential monopolies does not appear 
infinite. Some industries—the ones with higher entry barriers—are 
more readily monopolized than others or will yield higher monopoly 
rents for a given amount of resources spent in monopolizing them. 
This means that there are decreasing returns to monopolizing activity 
and inframarginal monopoly rents to firms that acquire the good 
monopolies. (I deal below with the fact that higher rents will call forth 
more effort in the securing of a particular monopoly.) 

More important, even were there constant costs in the production
of monopolies generally, it would not follow that monopoly rents 
corresponded to social costs. Consider the process through which 
profits are driven to zero in an ordinary competitive activity. In such 
an activity, when profits are being earned, new entrants come in and 
existing firms expand. The consequent expansion of supply bids 
prices down, reducing revenues, and the associated increase in the 
demand for factors bids input prices up, increasing costs. This goes 
on until profits have disappeared. 

Any attempt to describe this process when the activity is that of the
general production of monopolies runs into immediate trouble. Even
ignoring the fact that Posner assumes that input prices will not be bid
up, the desired conclusion will not follow. What is the “supply” of
monopolies generally, the expansion of which will bring down price? 
Why should the possessor of a monopoly in one industry have his 
rents reduced because others are attempting to secure monopolies in 
other industries? Why should his costs be increased? Plainly, this in¬ 
terpretation cannot lead to Posner’s results. 

Suppose, then, that we consider not the obtaining of monopolies in 
general but rather the obtaining of a particular monopoly. In this 
case—even apart from the difficulty of defining successive units of 
“output”—constant costs cannot be a general property, nor can the 
activity be characterized as “competitive.” Competition involves free 
entry, and monopolies are typically characterized by barriers to entry 
with incumbents enjoying advantages over potential entrants. This
means that the firm that is foresighted enough to enter such a monop- 
olizable area early will be able to monopolize it at a cost lower than 
that which latecomers would have to expend to wrest the monopoly 
away. This will result in a rent that will not be competed away by other 
potential monopolists. Not all of that rent need be the competitive
return to investment in information as to the availability of monopoly;
some or all of it can perfectly well be traditional monopoly rent. Even 
where an oligopoly is involved so that the rent-to-entry-investment 
process is more continuous than in the single-firm monopoly case, 
entry barriers can relieve incumbents of the necessity of spending all 
their rents in the effort to protect them from potential rivals. 

Monopolies can also be obtained through luck rather than 
foresight. It is true that, as Posner says (1975, p. 812), if n risk-neutral 
firms each have an equal chance of obtaining a monopoly with a 
present value of V, each of them will be willing to spend V/n in an
effort to secure the monopoly. Nevertheless, it does not follow that a 
total of V will in fact be spent (even apart from the question whether 
risk neutrality is a good assumption). Whether the total is spent depends
on the mechanism that produces the monopoly. If the monopoly
is achieved before V is spent or if the marginal effect of expenditure
on the chance of securing the monopoly falls to zero before V/n is
spent, then firms will not in fact spend so much. Only the unsup¬ 
ported assumption of constant returns in the activity of securing a 
particular monopoly produces a mechanism that leads to Posner’s 
result—and then only if one ignores the dynamics that may lead one 
firm to shut out others. 
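
The arithmetic behind this point can be sketched in a few lines of Python. This is an illustration only, not part of Posner's analysis: the function name, the `effective_cap` parameter, and the numbers are assumptions made for the example. The sketch computes each firm's maximum willingness to spend, V/n, and shows that total outlay reaches V only when nothing stops firms short of that maximum.

```python
def total_rent_seeking_outlay(V, n, effective_cap=None):
    """Total spending by n identical risk-neutral firms competing for a prize V.

    effective_cap is a hypothetical per-firm spending level beyond which extra
    outlay no longer raises the chance of winning. None means no such limit.
    """
    willingness = V / n  # the most each firm would pay for a 1/n chance at V
    if effective_cap is None:
        per_firm = willingness
    else:
        per_firm = min(willingness, effective_cap)
    return n * per_firm


V, n = 100.0, 4  # illustrative values only

# No limit on useful spending: the whole of V is spent.
full = total_rent_seeking_outlay(V, n)

# The marginal effect of spending falls to zero at 10 per firm, short of V/n = 25.
capped = total_rent_seeking_outlay(V, n, effective_cap=10.0)

undissipated_rent = V - capped
print(full, capped, undissipated_rent)
```

In the capped case part of V survives as rent to the winner, which is the situation the paragraph above describes.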

Note, however, that the fact that resources are expended on the 
attainment of monopoly certainly means that there are some cases 
where what appears as monopoly rent understates the resources spent 
on rent-seeking activities. Predictions about monopoly profits can 
overestimate as well as underestimate the amount to be gained, and 
luck can be bad as well as good. In some cases (private subways in New 
York City seem a likely example), more will be expended on the rent- 
seeking activity than the actual amount that the rents turn out to be. 
There is still no mechanism that makes such rents exactly equal the 
costs and no general presumption that overstatement cases must bal¬ 
ance understatement ones across the economy. (Indeed, if monopo¬ 
lies keep on being sought and rent seekers are not risk loving there is 
a presumption that rents exceed costs.) 

The point is that once one starts to think of real examples, Posner’s 
result disappears as a general proposition. The Aluminum Company 
of America, for example, was well placed to monopolize because of 
the business it was in, an industry requiring particular mineral re¬ 
sources and cheap energy supply. It was in that business because of 
the patents it had originally obtained. The fact that it may have been 
drawn into patent research in aluminum by the possibility of monop¬ 
oly rents does not alter the fact that once it was in and had monopo¬ 
lized the business, no further entry into monopolization of aluminum 
was possible at the same cost, and no entry into the monopolization of 
other businesses—even businesses with equally attractive monopoly
rents—could bid away the monopoly rents already being earned in
aluminum. The fact that a risk-neutral company would have been 
willing to spend the expected present value of all future monopoly 
rents to obtain the patents in the first place does not imply that they 
had to spend so much or that they or their potential rivals did so. 
Once the patents were obtained, at whatever cost, the future monop¬ 
oly rents were achieved and further expenditure by anyone was 
pointless. 

Much of this can be summarized by considering Posner’s statement 
that “at the margin, the cost of obtaining a monopoly is exactly equal 
to the expected profit of being a monopolist." If the activity involved 
is that of obtaining monopolies generally, then that statement is un¬ 
questionably true but has no bearing on the issue. If, on the other 
hand, the activity is that of obtaining a particular monopoly, then the 
statement is not true. In equilibrium the costs of wresting the monop¬ 
oly from the incumbent must be at least as great as the monopoly 
rents to be earned by doing so, but they need not be equal. This 
means that the incumbent can, in fact, be earning monopoly rents 
above the costs expended to secure them (the fact that he would have 
been willing to spend more if necessary has no bearing). Successful 
monopolists enjoy inframarginal rents, and there is no general mechanism
that competes those rents away.

I say no general mechanism, because there clearly are cases in 
which some such mechanism operates. These are the cases Posner 
appears to have in mind; they have to do with government-induced 
monopoly and with regulation. 

Potential monopolists are somewhat more likely to be on an equal 
footing where barriers to entry arise simply through government action
than when such barriers arise for other reasons. The picture of
resources expended on lobbying for a monopoly license until the 
eventually successful applicant has spent all the rents to be earned 
is one of some plausibility. The extent of that plausibility, however, is 
more limited than may at first appear. Before a monopoly license is 
given, all applicants may be on an equal basis. Once the license has 
been granted, however, regulatory authorities may be reluctant to 
transfer it. The Federal Communications Commission, for example, 
has almost never failed to renew the television license of an existing 
station. If incumbents have an advantage over potential replacements 
in the licensing process, it does not follow that their incumbency rents 
are no greater than the value of the resources expended to retain 
them, including the resources expended by unsuccessful applicants. 1 

1 Rogerson (1982) presents a formal model of the results of advantage to the
incumbent in the rent-seeking process.

Furthermore, those rents may exceed even the value of the resources
expended to obtain the original monopoly license. It is hard
to imagine, for example, that all or most of the originally successful 
applicants for broadcast licenses in VHF television correctly recog¬ 
nized in the 1940s the size of the rents eventually to be earned or that, 
if they did, they had to compete against a large body of unsuccessful 
applicants who shared that recognition. Moreover, there appears to 
have been some tendency for successful applicants to have been al¬ 
ready involved in radio broadcasting. To the extent that the FCC 
favored such applicants, they earned a monopoly rent even if all 
applicants recognized the value of television licenses. While it is true 
that such rent accrued by virtue of the earlier radio license, to attempt 
to make it the equivalent of social costs by arguing that radio license 
applicants all expected and competed for the rents later to be made in
television is to strain credulity. 

The general point of this example is as follows. Even where govern¬ 
ment regulation is involved in the production of monopoly, not all 
potential monopolists will be equally situated. While, in such contexts, 
Posner is undoubtedly correct that some resources are likely to be 
expended in getting and retaining the monopoly, and while the re¬ 
sources involved are likely to be greater in such contexts than in those 
areas that do not involve government support, it is still unlikely that 
the resources expended will match the rents to be earned. 

I add one final point related to a use that others have made of
Posner’s paper rather than directly to the paper itself. 2 Even were 
Posner’s entire analysis correct and applicable, it would not follow 
that the benefits of increased or better antitrust enforcement should 
be taken to include the monopoly rents being earned in those additional
industries where the improved enforcement restores competition.
The monopoly rents being earned in industries where antitrust
cases are brought correspond (in Posner’s analysis) to resources already
wastefully spent to achieve them. Those costs are investments in
monopoly; they are generally sunk costs by the time of antitrust enforcement
and cannot be recovered by the removal of the resulting
monopoly rents. 3 Only if the improved enforcement mechanisms apply
to attempts to monopolize or if such attempts are deterred by the
improvement can the monopoly rents avoided be said to correspond 
to social costs that are saved. Whether the deterrent effect of marginal 
improvements in antitrust enforcement is at all important is, of 


2 See the works on the line-of-business program cited in the opening paragraph of
this Comment.

3 Ongoing expenditures to retain the monopoly would be saved, however. In this 
respect (which Posner does not consider), Posner's assumption (1975, p. 809, n. 3) "that 
the monopoly is enjoyed for one period only" does affect the analysis.

course, debatable; 4 even if that effect is important, decreasing returns
to monopolization activity may mean that one catches large-rent mo¬ 
nopolies and deters small-rent ones. In any event, the social costs 
avoided through such deterrence cannot be measured by using the 
monopoly rents removed in the cases that provide the additional ob¬ 
ject lessons. 

In sum, Posner’s analysis does indeed show that the standard analysis
of the costs of monopoly as measured only by deadweight loss can
understate those costs. While there are thus some circumstances in
which some monopoly rents should be included in the construction of 
such a measure, it is an open question whether those circumstances 
are so general as to prompt the inclusion of all or nearly all such rents. 
Broad general theory will not provide the answer here; that answer 
must rest on a case-by-case analysis.

Estimates of the Deterrent Effect of Capital
Punishment: The Importance of the
Researcher's Prior Beliefs

Introduction

Researchers from different social sciences approach the issue of the 
deterrent effects of capital and noncapital punishments with conflict¬ 
ing prior beliefs. Some believe that punishments deter and that social 
and economic variables have little or no influence on the murder rate. 
Others believe that punishments have little or no impact and that 
variations in the murder rate between states (or over time) can be 
explained largely by variations in economic and social conditions. 
Others hold somewhat different views. If controlled experiments 
could be designed to test the competing hypotheses, researchers 
could assess their validity. However, since research in the effec¬ 
tiveness of punishments in reducing the murder rate must be carried 
out in a nonexperimental setting, there is much uncertainty as to the 
“correct” empirical model that should be used to draw inferences, and 
each researcher typically tries dozens, perhaps hundreds, of specifica¬ 
tions before selecting one or a few to report. Usually, and under¬ 
standably, the ones selected for publication are those that make the 
strongest case for the researcher’s prior hypothesis. Because of this,
research results are greatly discounted and even ignored by professional
readers, who are painfully aware, from personal experience, of
the great amount of searching for a suitable specification that goes on
behind the scene.

I am indebted to Ed Leamer, Sam Peltzman, and an anonymous referee for helpful
comments on an earlier draft. Versions of this paper were presented at the UCLA
Mathematical Economics and Econometrics Workshop, the Georgia State University
Interdisciplinary Conference on Capital Punishment, and the Western Economic
Association Annual Meetings. Support from NSF grant SOC78-09477 is gratefully
acknowledged.

In this study I use a Bayesian econometric technology, due to 
Leamer (1978), to pool several possible alternative prior beliefs concerning
the determinants of the murder rate with cross-state data
from 1950. The approach is Bayesian in that the researcher’s prior
beliefs about which variables belong in an equation explaining the
murder rate are pooled with the data evidence and the results are
summarized by the posterior distribution. In principle, this allows 
researchers with contrasting prior information or beliefs to reach a 
mutually accepted conclusion, if the data are sufficiently strong. Con¬ 
flicting prior beliefs need not lead to conflicting inferences, though 
they might. 

Statistical Background 

This section gives a brief description of the econometric procedure I
use in this study. For more detailed treatments see Leamer (1978).
The idea behind the procedure is that a Bayesian researcher with 
prior information about some of the parameters in a linear regression 
model will be led to summarize the evidence by considering a range of 
constrained least-squares estimates, depending on how he is willing to 
specify his prior information. The benefit of the Bayesian approach is 
that prior information, which is implicitly used in any interpretation 
of data evidence built on constrained estimates, is used explicitly. 

Suppose a researcher interested in estimating the effects of different
factors on the murder rate has identified two sets of (potential)
explanatory variables, one set which he is fairly certain belongs in the
regression, and a second set of doubtful variables. The orthodox
practice is to run regressions with every possible combination of
doubtful variables. Such a practice has much to recommend it, if it is
used honestly and wisely. Leamer (1978) and Mayer (1980) suggest
this practice if extreme estimates of parameters of interest are reported.
Most often, however, a researcher will report only his “best”
regression, from the point of view of confirming his hypothesis. At
the very least, he ought to report his “worst” estimate as well. The
benefit of Leamer's procedure is that reporting of extreme estimates
is simpler and more concise.
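
The orthodox practice described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: it searches only over the inclusion or exclusion of each doubtful variable, which is a special case of the linear constraints that Leamer's method covers. The data are synthetic, and the function name `extreme_estimates` is hypothetical.

```python
from itertools import combinations

import numpy as np


def extreme_estimates(y, X_focus, X_doubtful):
    """Smallest and largest OLS coefficient on the first focus variable across
    all subsets of the doubtful variables (an intercept is always included)."""
    n = len(y)
    k = X_doubtful.shape[1]
    estimates = []
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            X = np.column_stack([np.ones(n), X_focus, X_doubtful[:, list(subset)]])
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            estimates.append(beta[1])  # column 1 is the first focus variable
    return min(estimates), max(estimates), len(estimates)


# Synthetic data for illustration only (44 observations, 3 doubtful variables).
rng = np.random.default_rng(0)
n_obs = 44
focus = rng.normal(size=(n_obs, 1))
doubtful = rng.normal(size=(n_obs, 3))
y = 1.0 - 2.0 * focus[:, 0] + 0.5 * doubtful[:, 0] + rng.normal(scale=0.5, size=n_obs)

lo, hi, count = extreme_estimates(y, focus, doubtful)
print(f"{count} regressions; focus coefficient ranges from {lo:.3f} to {hi:.3f}")
```

With k doubtful variables the search runs 2^k regressions, which is one reason a more concise way of reporting the extremes is attractive.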

The alternative technology proceeds by forcing the researcher to 
specify his prior beliefs in such a way that they can be explicitly pooled 
with the data. First, he needs to specify a set of doubtful variables, the 
associated parameters of which are thought to be small relative to
their standard errors. In the context of a linear regression, the
coefficients on the doubtful variables are thought to be small. The
researcher is willing to control for these variables, in a nonexperimental
setting, in any one of a number of ways. This limits the estimated
coefficients to lie on what Leamer calls the feasible ellipse, obtained by
considering all possible linear constraints on the coefficients of the
doubtful variables.

Second, the researcher may pick a measure of doubtfulness—that 
is, of how far the estimates of the doubtful parameters stray from 
zero. For example, he may measure doubtfulness for a set of parame¬ 
ters as the sum of their squared deviations from the origin. This limits 
posterior estimates to what Leamer calls the information contract
curve. Finally, he can pick a unique prior standard error or a range of 
prior standard errors. This specifies how strongly he holds his prior 
beliefs about the smallness of the doubtful parameters and limits the 
posterior estimates to portions of the information contract curve. 

Quite often the extreme estimates over the feasible ellipse are 
highly unlikely from the point of view of the data. To prevent dog¬ 
matic priors from having an undue influence on inferences, the re¬ 
searcher can report extreme estimates from the feasible ellipse that 
are also constrained to lie within some (arbitrary) data confidence 
region. In the empirical application I present here, only the first form 
of prior information, identification of a set of doubtful variables, is 
used. This is supplemented with extremes constrained to lie within 
the 90 percent data confidence region. 

Empirical Application 

Data 

I use aggregate data from 44 states in 1950 to estimate the effects of
economic, social, and deterrence variables on the murder rate. 1 The 
murder rate is the FBI estimate of the number of murders per 
100,000 residents. Independent economic variables are the median 
income of families (in 1949), the fraction of families (in 1949) with 
income of less than one-half of the median income (a measure of 
income dispersion), the state unemployment rate, and the labor force 
participation rate. Independent social variables are the fraction of the 
state population nonwhite, the fraction of the population ages 15—24 
years, the fraction urban, the fraction male, the fraction of families 
with husband and wife both present, and a dichotomous indicator for 
southern states. 


1 Data are available from the author on request.

Deterrence variables are the focus of this study. Length of sentence
for murder is the median time served in months by prisoners convicted
of murder who were released in 1951. The probability of conviction
for murder is estimated by the ratio of the number of convictions
for murder in 1950 to the estimated number of murders in the
state (the FBI estimate of murders per 100,000 times the population
in 100,000s). The conditional probability of execution given conviction
is estimated by the average number of executions 1946-50 divided
by convictions.

Of the 44 states in the sample, 35 carried out at least one execution
between 1946 and 1950; the other nine states did not execute in this
period. A dichotomous variable to indicate an executing state
was also included in the analysis.
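
The construction of these variables for a single state can be sketched in Python. This is a minimal illustration: the function name, argument names, and input figures are hypothetical, and it assumes that the convictions in the denominator of the execution probability are the 1950 convictions.

```python
def deterrence_variables(murder_rate_per_100k, population, convictions_1950,
                         executions_1946_50):
    """Build the probability measures and executing-state indicator for one state.

    executions_1946_50 is a list of annual execution counts for 1946 through 1950.
    """
    # Estimated murders: rate per 100,000 times population in 100,000s.
    estimated_murders = murder_rate_per_100k * (population / 100_000)
    p_conviction = convictions_1950 / estimated_murders

    # Average annual executions over the period, divided by convictions.
    avg_executions = sum(executions_1946_50) / len(executions_1946_50)
    p_execution_given_conviction = avg_executions / convictions_1950

    # Dichotomous indicator: 1 if the state executed anyone in the period.
    executing_state = int(sum(executions_1946_50) > 0)
    return p_conviction, p_execution_given_conviction, executing_state


# Hypothetical state, not taken from the data set.
p_conv, p_exec, executing = deterrence_variables(
    murder_rate_per_100k=5.0,
    population=2_000_000,
    convictions_1950=40,
    executions_1946_50=[1, 0, 2, 1, 1],
)
print(p_conv, p_exec, executing)
```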


Alternative Prior Beliefs 

To demonstrate how Bayesian statistical techniques can be used to
address the deterrence question, I propose five priors of representative
researchers that I pool with the data. A longer list could clearly be
used, but I think these five are sufficient to give the flavor of the
Bayesian approach. A shorter list also has merit, but one of the
strengths of the Bayesian techniques I use is the ability to handle
numerous alternative priors in a unified framework. I have given
each of the priors a short name simply for ease of exposition, not to
suggest value judgments.

1. The right-winger: This researcher believes that the deterrence
variables belong in the murder rate equation and that the economic
and social variables are doubtful. The economic and social variables
may be influential, but the right-winger is willing to control for them
in any one of a number of ways.

2. The rational maximizer: This researcher has an economist's view
of crime. He believes that punishments affect the murder rate 
through their influence on individual murder “supply” and, to the 
extent that murders are associated with property crimes, that the 
economic variables will affect the murder rate through that channel. 

3. Eye for an eye: A researcher with this prior treats length of 
sentence as doubtful along with economic and social variables. He 
holds that only the probability of receiving capital punishment can 
deter murderers. 

4. The bleeding heart: Deterrence variables and social variables are
doubtful. If the economic conditions of individuals with a current 
high propensity to commit murder were improved, then the social 
variables would not matter. In addition, the bleeding heart believes 
that punishments are ineffective in deterring murders. 

5. Crime of passion: This researcher considers murders as largely
acts of passion, not as the result of rational calculation of costs and
benefits. He thinks that the coefficients of the so-called deterrence
variables are thus likely to be small and insignificant. On the other
hand, the economic and social variables are likely to be influential,
since they are proxies for the propensity to violent outbursts that
could result in murder.

TABLE 1

Treatment of Variables by Different Priors

                        Deterrence    Economic      Social
Prior                   Variables     Variables     Variables

Right-winger            Important     Doubtful      Doubtful
Rational maximizer      Important     Important     Doubtful
Eye for an eye          *             Doubtful      Doubtful
Bleeding heart          Doubtful      Important     Doubtful
Crime of passion        Doubtful      Important     Important

Note. Deterrence variables: probability of conviction, probability of execution (given conviction), and months
of prison sentence. Economic variables: median family income, fraction of families with less than half the median
income, unemployment rate, and labor force participation rate. Social variables: fraction nonwhite, fraction ages
15-24 years, fraction urban, fraction male, fraction of husband and wife both present in families, southern state
indicator.

* The eye-for-an-eye prior treats the probability of conviction and the probability of execution as important
variables and the length of sentence as a doubtful variable.

Table 1 presents a summary of the treatment of variables by researchers
with the several priors. This list of potential priors is by no
means exhaustive, but there are enough different viewpoints represented
to give the flavor of the Bayesian analysis. 2

A serious problem I faced in specifying the priors was how to treat 
the dichotomous variable identifying executing states from the 
nonexecuting states. Ehrlich (1977) argued that this variable ought to 
be included to prevent a specification bias associated with its omission. 
This assumes that the variable belongs in the “true” equation, but 
specification bias can result from inclusion of an inappropriate variable
as well as from exclusion of an appropriate variable. Thus, I have
pooled each prior with the data two ways: first with the indicator on 
the included list, and second with the indicator on the doubtful list. 


Pooling Data and Priors

The data information is summarized by an unconstrained regression
of the murder rate on all of the deterrence, economic, and social
variables. The least-squares estimates of the deterrent effects of capital
and noncapital punishment are reported in table 2 as the number
of murders prevented for each execution, conviction, or added
month of prison sentence. 3 The estimated deterrent effects are that:
for each additional conviction for murder, 1.14 murders are prevented,
with a standard error of 0.62; for each additional execution,
13.22 murders are prevented, with a standard error of 7.20; and for
each month added to the median prison sentence for murder, 0.44
murders are prevented, with a standard error of 0.28.

2 Some of the priors selected will always have more extreme estimates than others,
because they are nested. For example, the right-winger and bleeding heart will both
have more extreme estimates than the rational maximizer, since they differ from the
rational maximizer only in that they both treat more variables as doubtful. A prior
nested in another will suggest a wider range of potentially acceptable models and thus
wider extreme estimates of parameters of interest.

TABLE 2

Extreme Estimates over the Feasible Ellipse and within the 90 Percent Data
Confidence Region Including Executing State Indicator
in All Specifications Considered

                                Effect of Variable on Number of Murders
Alternative Prior Beliefs     Convictions    Executions    Sentence Length

Right-winger:
  Maximum estimate                 -.22         -1.16            -.16
  Minimum estimate                -2.50        -22.56           -1.45
  Difference                       2.28         21.40            1.29
Rational maximizer:
  Maximum estimate                 -.72        -10.24            -.38
  Minimum estimate                -1.35        -15.91            -.86
  Difference                        .63          5.67             .48
Eye for an eye:
  Maximum estimate                  .22          -.35             .88
  Minimum estimate                -2.57        -26.20           -1.55
  Difference                       2.79         25.85            2.43
Bleeding heart:
  Maximum estimate                 1.06         12.37             .51
  Minimum estimate                -2.29        -25.59            -.95
  Difference                       3.26         37.96            1.46
Crime of passion:
  Maximum estimate                  .35          4.10             .19
  Minimum estimate                -1.49        -17.32            -.63
  Difference                       1.64         21.42             .82
Data estimate                     -1.14        -13.22            -.44
  (SE)                            (.92)        (7.20)           (.28)

Table 2 gives the extreme estimates, within the 90 percent data 
confidence region, of the effects of deterrence variables on the num¬ 
ber of murders for the five priors. The executing state indicator was 
included as an important variable in all possible specifications. The 
numbers reported are the effect of the variable in question on the 


3 The equation was estimated in a different form, but the transformation allows me
to ask, How many murders will be prevented for each execution (conviction, added
month of sentence)?

number of murders, so that negative numbers are murders prevented
and positive numbers are murders encouraged. Under each prior the 
extreme estimates of each effect are reported, followed by the abso¬ 
lute difference between the maximum and minimum estimates. The 
absolute difference between the extremes can be thought of as the 
specification uncertainty regarding the parameter, to be compared to 
the sampling uncertainty. 

Consider the effect of an extra execution on the number of mur¬ 
ders. Table 2 indicates that the choice of prior is important, because 
the specification uncertainty is great for some priors, and because 
different priors yield widely different estimates. The right-winger, 
rational-maximizer, and eye-for-an-eye priors bound the effect of an 
execution away from zero. The bleeding-heart and crime-of-passion
priors, in contrast, do not bound the effect of an additional execution
away from zero.
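
The comparison in this paragraph can be reproduced directly from the execution column of table 2. The Python sketch below is illustrative only: the dictionary layout and the function name `summarize` are assumptions, while the numbers are the maximum and minimum estimates reported in the table.

```python
# Extreme estimates of the effect of one execution on the number of murders,
# taken from table 2 as (maximum, minimum) for each prior.
execution_bounds = {
    "right-winger": (-1.16, -22.56),
    "rational maximizer": (-10.24, -15.91),
    "eye for an eye": (-0.35, -26.20),
    "bleeding heart": (12.37, -25.59),
    "crime of passion": (4.10, -17.32),
}
SAMPLING_SE = 7.20  # standard error of the unconstrained estimate in table 2


def summarize(bounds):
    """Return {prior: (specification uncertainty, bounded away from zero?)}."""
    summary = {}
    for prior, (hi, lo) in bounds.items():
        spread = round(hi - lo, 2)        # absolute difference between extremes
        excludes_zero = hi < 0 or lo > 0  # both extremes share the same sign
        summary[prior] = (spread, excludes_zero)
    return summary


summary = summarize(execution_bounds)
for prior, (spread, excludes_zero) in summary.items():
    print(f"{prior:20s} spread={spread:6.2f}  "
          f"spread/SE={spread / SAMPLING_SE:4.2f}  bounded from zero={excludes_zero}")
```

The spread column matches the "Difference" rows of table 2, and the last column flags the three priors whose extreme estimates are both negative.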

The largest estimated deterrent effect of executions, -26.20, is 
under the eye-for-an-eye prior, which seems sensible since this prior 
holds that only executions and convictions affect the murder rate. 
The largest positive effect of executions on murders (which might be 
called an encouragement effect rather than a deterrent effect) is 
12.37 under the bleeding-heart prior. However, the bleeding-heart 
prior has a minimum estimated effect of -25.59, or a deterrent effect
almost as large as the largest under the eye-for-an-eye prior. 

Looking at the various extreme estimates in table 2, I conclude that 
significant conflicts remain over the estimated deterrent effect of an 
additional execution, even after the researchers have confronted the 
same data. The conflicting interpretations of the data evidence are 
serious. The right-winger, rational-maximizer, and eye-for-an-eye re¬ 
searchers will conclude that zero or positive effects of executions are 
impossible, while researchers with the other two priors will conclude 
that zero, positive, or negative effects are all possible. Another conflict 
exists between the eye-for-an-eye and the bleeding-heart priors since 
they lead to the most extreme estimates of all the priors. Choice of 
prior is clearly important, but the data do not give strong direction in 
selecting a prior. 

Conflicts over the interpretation of the data evidence do not stop 
with the effect of executions. Similar conflicts involve the effects of 
convictions and months of sentence. In table 2, the right-winger and 
rational-maximizer priors bound both effects away from zero, but the 
other priors do not. The choice of prior is important for these effects
as well as for the effects of executions, but the data do not help in the
choice of prior.

Table 3 repeats the exercise of table 2, with the exception that the
executing state indicator is treated as a doubtful variable by each
prior. The differences between tables 2 and 3 are startling. If the
executing indicator is a doubtful variable, then none of the priors can
bound the deterrent effect of executions from zero. The largest
estimated deterrent effect in table 3 is under the bleeding-heart prior
(-31.4), though the eye-for-an-eye prior is not far behind (-29.1).
The greatest encouragement effect is under the bleeding-heart prior
(17.8). The minimum (negative) estimates in table 3 are not
significantly different from the minimum estimates in table 2, but all
the maximum estimates are seriously affected. Specification of prior
beliefs concerning the executing indicator is very important,
especially for the priors that could bound the deterrent effect away from
zero with the indicator treated as an important variable.

TABLE 3

Extreme Estimates [title partly illegible in source] Data
Confidence Region Treating Executing State Indicator
as a Doubtful Variable for Each Prior

                               Effect of Variable on Number of Murders
Alternative Prior Bounds     Convictions    Executions    Sentence Length

Right-winger:
  Maximum estimate               -.14           7.84            -.06
  Minimum estimate              -2.63         -23.40           -1.53
  Difference                     2.49          31.24            1.47
Rational maximizer:
  Maximum estimate               -.53           4.66            -.21
  Minimum estimate              -1.73         -15.90           -1.11
  Difference                     1.20          20.56             .90
Eye for an eye:
  Maximum estimate                .23           7.87             .89
  Minimum estimate              -2.65         -29.10           -1.60
  Difference                     2.88          36.97            2.49
Bleeding heart:
  Maximum estimate               1.53          17.80             .72
  Minimum estimate              -2.70         -31.40           -1.17
  Difference                     4.23          49.20            1.89
Crime of passion:
  Maximum estimate                .56           6.54             .28
  Minimum estimate              -1.70         -19.77            -.72
  Difference                     2.26          26.31            1.00
Data estimate                   -1.14         -13.22            -.44
  (SE)                           (.62)         (7.20)           (.28)
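The Difference rows in table 3 are the maximum estimate minus the minimum estimate under each prior, and a prior bounds an effect away from zero only when both extremes share a sign. A minimal sketch of that arithmetic for the Executions column follows; the dictionary layout and function names are illustrative assumptions, not part of the original study.

```python
# Extreme estimates for the effect of one execution on the number of
# murders, copied from the Executions column of table 3 as
# (maximum estimate, minimum estimate) under each prior.
EXECUTION_BOUNDS = {
    "right-winger": (7.84, -23.40),
    "rational maximizer": (4.66, -15.90),
    "eye for an eye": (7.87, -29.10),
    "bleeding heart": (17.80, -31.40),
    "crime of passion": (6.54, -19.77),
}


def specification_uncertainty(bounds):
    """Return maximum minus minimum for each prior, rounded to two decimals."""
    return {prior: round(hi - lo, 2) for prior, (hi, lo) in bounds.items()}


def bounded_away_from_zero(bounds):
    """Return the priors whose two extreme estimates have the same sign."""
    return [prior for prior, (hi, lo) in bounds.items() if hi < 0 or lo > 0]


if __name__ == "__main__":
    for prior, spread in specification_uncertainty(EXECUTION_BOUNDS).items():
        print(f"{prior:20s} {spread:6.2f}")
    print("bounded away from zero:", bounded_away_from_zero(EXECUTION_BOUNDS))
```

Every interval in this column straddles zero, which is the sense in which none of the priors can bound the deterrent effect of executions from zero once the indicator is treated as doubtful.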

In table 3 the bounds for the effects of convictions and months of
prison sentence are not significantly different from those in table 2.
The main effect of treating the executing state indicator as doubtful is
to increase greatly the uncertainty about the deterrent effect of
executions.

In the single-equation context of this study, the interpretation of 
the executing state indicator relates to the functional form of the
effect of the probability of execution on the murder rate. The ques¬ 
tion is whether the probability of execution has a linear effect on the 
murder rate or a more complex functional form. The states that do 
not execute in the sample have a lower murder rate than the execut¬ 
ing states (1.9 vs. 6.2), so they are outliers in both the murder rate and 
the probability of execution. If there is uncertainty regarding the 
functional form, then outliers produce the familiar “dumbbell” re¬ 
gression problem where little can be inferred about signs and mag¬ 
nitudes of coefficients. 


Conclusions 

Several alternative prior beliefs that different researchers might ap¬ 
proach the deterrence issue with were pooled with data for states in 
1950. The paper demonstrates how Bayesian statistical methods can 
be used to shed light on the importance of researchers’ prior beliefs in 
empirical work. The data analyzed are not sufficiently strong to lead
researchers with different prior beliefs to reach a consensus regarding
the deterrent effect of capital punishment. Right-winger, rational-
maximizer, and eye-for-an-eye researchers will infer that punishment 
deters would-be murderers, but bleeding-heart and crime-of-passion 
researchers will infer that there is no significant deterrent effect. 

If researchers treat the executing indicator as a doubtful variable, 
then they will all infer that there is not a significant deterrent effect of 
capital punishment. The importance of this indicator in drawing in¬ 
ferences suggests that the single-equation framework used here, and 
in many other studies of the determinants of the murder rate, may be 
inadequate for resolving the issue. 
Democratic Economic Policy. By Bruno S. Frey.

New York: St. Martin’s Press, 1983. Pp. viii + 303. $32.50. 


This book presents a radical departure from the conventional theory of eco¬ 
nomic policy formation in which an exogenous government is viewed as cor¬ 
recting market failure via informed interventions. Frey treats government 
instead as an endogenous agent within the politico-economic system, acting 
not at all autonomously but rather under the influence of myriad different 
forces. In this process, voters, bureaucracy, private interest groups, political 
parties, and economic advisers all play a role, with their behavior best ana¬ 
lyzed by reference to the theory of public choice and with emphasis placed on 
the equilibrium characteristics of political markets. The book is divided into 
four parts. In part 1, economic policy is analyzed within a politico-economic
framework and a distinction is drawn between decision making that occurs at
the level of social consensus and that which takes place in the current politico-
economic process. Part 2 deals with institutions at the level of the social
consensus, evaluating the terms of social contracts that establish rules
constraining the political process. Part 3 analyzes economic policy interventions
in the current politico-economic process. Part 4 discusses the role of economic 
policy advising and the status, role, and incentives of the economic policy 
advisers on whom Frey would place a considerable responsibility for improv¬ 
ing the quality of democratic economic policy. 

Despite the apparent promise offered by such a text, and the transparent
academic integrity of its author, in my view the book is seriously flawed, as a
consequence of the author’s own background exposure path to public choice,
his less-than-sure grasp of the current public choice approach, and his failure
to press sufficiently far the self-seeking axiom on which his analysis is
predicated. The force of these criticisms will become apparent as this review
unfolds.

First, let us examine Frey’s interpretation of policy-making by unanimous 
social consensus, which draws heavily (but in my view misguidedly) on the 
contributions inter alia of John Rawls (1971) and Geoffrey Brennan and 
James Buchanan (1980), which emphasize the importance of the veil of igno¬ 
rance as a basis for constitutional contract. Repeatedly Frey relies on the 
existence of uncertainty as the force that drives citizens into protecting them¬ 
selves against potential adverse outcomes via a system of "higher-level” con¬ 
stitutional rules. He stresses the role of economic advisers (of which more 
later) in suggesting contracts from which all such individuals in the state of 
uncertainty would assume that their position will be improved given that such 
contracts would be abrogated only by a qualified majority vote of the electorate.
Unfortunately, Frey appears really to believe that the veil of ignorance
serves such a purpose in the real world, apparently failing to recognize that 
the veil itself as developed by Rawls is a philosophical hypothetical designed 
to derive the principles that satisfy criteria of fairness. Even Brennan and 
Buchanan (1980), who employ Rawls’s hypothetical extensively in their 
theory of government as a tax-revenue-maximizing Leviathan, recognize
explicitly that “description of the idealized setting is useful primarily as an
analytical benchmark rather than as a model that is expected to prevail.” Of 
course these authors are correct, since the veil almost always is lifted in the 
real world and individuals are provided with some information concerning 
their probable life chances. Indeed, it is precisely because there are expected 
gainers and expected losers from constitutional rules that such rules are so 
difficult to establish in practice, at least on the basis of unanimous consent. 

Insofar as constitutional rules are seriously countenanced in the real world, 
usually it is not in response to the veil of ignorance uncertainty. Rather it is a 
response both to the recognition that gains from trade exist from protecting 
certain transactions from the pressure of in-period politics by denying a med¬ 
dlesome majority its wealth destructive opportunities and to the fear that 
governments, in the absence of constraints, may pursue “worst-case” policies. 
For, as Brennan and Buchanan have argued elsewhere (1983), “the model of 
human behavior appropriate for comparative institutional analysis will gener¬ 
ally be one that generates worse outcomes . . . than the empirical record 
would justify.” Thus also David Hume (1963): “In constraining any system of 
government and fixing the several checks and controls of the constitution, 
every man ought to be supposed a knave, and to have no other end, in all his 
actions, than private interest.” 

Frey also glosses over the Prisoner's Dilemma obstacle to social consensus
where gains from trade are apparent. Even in such circumstances, as
Buchanan, Tollison, and Tullock (1980) have demonstrated, if the payoff matrix
is asymmetrical and consistent winners and losers are evident, the side
payments required for consensus may well not be forthcoming and the
constitutional solution thus may be obstructed. Once again, Frey introduces an
important concept in public choice but then fails to note recently developed
caveats concerning its applicability to social consensus outcomes in the real
world.

A not dissimilar weakness is evident in Frey’s treatment of pressure groups.
Frey praises citizens' initiatives based on collective action as beneficial for
democracy and as constraining the behavior of in-period governments. He
correctly recognizes the logic of collective action as outlined by Olson (1965),
which implies that pressure groups will develop unevenly and demonstrate
uneven political success in response to their respective abilities to privatize
benefits or to coerce memberships. His proposed solution, however, whereby
economic advisers would inform the unorganized of the benefits of
collectivization, is naive. If the benefits exist and organization is feasible,
entrepreneurs will already be organizing such groups. Typically, information is not
the obstacle; rather, the free-rider problem associated with high transactions
costs denies certain potential collectives the fruits of political participation.
Moreover, even where collective action is widespread, public choice analysis
usually does not predict that it leads to the gains-from-trade solution. In
many cases, it will collapse into the negative-sum supergame of the Prisoner's
Dilemma.

The market alternative to the political route to social consensus is also
evaluated by Frey, quite appropriately within the framework of a comparative
institutions analysis. But here there is no recognition of the recent literature
on rent seeking, which transforms the conventional public finance discussions
from which Frey’s own analysis clearly stems. The notion that social consensus
on market rules, designed to eliminate market failure, will be forthcoming if
information is made sufficiently available once again conflicts with recent
public choice results that indicate that severe Prisoner’s Dilemma obstacles
exist to the political resolution of such problems, even when transactions costs
are not prohibitive. Frey also ignores entirely a growing view in normative
public choice to the effect that “whatever is, is efficient” in a comparative
institutions setting. This is not a view I share, but it is worthy of some
consideration in a study of democratic economic policy, most especially within the
context of the social consensus discussion.

Frey's evaluation of bureaucratic decision making and the problems it poses
for social consensus is much more convincing. Drawing on the contributions
by Williamson (1964), Niskanen (1971), and Orzechowski (1977), he clearly
exposes the discretionary nature of most bureaucratic decision making
together with the self-seeking nature of the bureaucrats themselves. Frey does
not evaluate the recent applications of principal/agent analytics to
bureaucratic behavior, but, arguably, this does little more than to consolidate the
principal findings of the above-mentioned contributors. However, his
proposed solutions to the bureaucracy problem, which relate to increased
competition between administrative units and to the use of direct voter referenda on
administrative issues, in my view are unlikely significantly to alleviate the
control loss problem imposed by government bureaucracy on the democratic
political process.

Now let us turn to Frey’s analysis of economic policy interventions in the
current politico-economic process, given the framework of rules derived via
social consensus. In this section of his book, Frey evaluates the prospect of
influencing behavior through the provision of information, analyzes
alternative methods for collecting information on citizen preferences on behalf of
government and its bureaucracy, and outlines the most effective means for
applying public policy. Frey recognizes that information tends to be
disseminated unevenly through the various agents in the political process and
recognizes the role of the free-rider problem in this process. In my view, some
of his information “solutions” such as, for example, the use of state-controlled
radio and television “to guarantee a diversity of opinions” are naive, and the
role that he would assign to economic advisers is potentially dictatorial. Much
of it smacks indeed of Arrow's social decision maker, who is supposed to read
individual preferences from the domain and transmit them suitably massaged
to the range of social choice. This is anathema to the public choice approach,
at least as it has been developed by the “Virginia School.” Frey suggests that
the economist's role is to advise the political parties as to the preferences of
the voters over policy issue space while relying on competition in the political
marketplace to ensure compliance between voter preferences and public
policy. Such a view ignores the fact that only some majority preference at best
will be reflected by competing parties and indeed the finite probability that
spatial immobility may prevent political market convergence and produce a
minority preference solution. It also ignores the ability of political parties to
force full-line policy programs, to take advantage of voter memory-decay
rates, to exercise policy discretion, and to succumb to bureaucratic and
pressure group subversions of their policy platforms. The chapter on economic
policy instruments reverts entirely to conventional public finance with the
emphasis on the efficiency characteristics, for example, of tax-price as
compared with regulatory solutions in externality situations and with only a
passing reference to the public choice obstacles to their effective introduction and
implementation.

The final section of the book attempts to tackle the serious problem of 
incoherence that threatens any work of this kind. Specifically, if everything 
within the political process is treated as endogenous, then the system is fully 
explained and there is no effective route to change the political outcome: 
what is, simply is. To discuss policy from an interventionist standpoint in such 
circumstances negates the theoretical thrust. Understandably, those who de¬ 
sire change in such circumstances are forced to recognize exogenous forces of 
one kind or another, be they the courts or voter preferences or even revolu¬ 
tion and coup d’etat. For Frey, the economic adviser assumes this function, 
offering deals and information designed to shift political equilibrium. But 
Frey’s solution does not convince. If economic advisers are employed by 
special interests, as Frey recognizes, they will reflect their paymasters’ inter¬ 
ests, perhaps intensifying negative-sum games within society. If they are em¬ 
ployed by the state, they will reflect the short-term interests of the in-period 
government, and this irrespective of free entry into economic advice or of the 
apparent independence of their organization (as the U.S. Council of Eco¬ 
nomic Advisers clearly evidences). 

This does not imply that independent research will not exist or that
economic ideas will not from time to time direct the political process. The
monetary revolution and the deregulation literature are witness to the power of
outside ideas. Scholars working in such fields from initial minority positions
are the exogenous entrepreneurs of political change uncaptured by the
existing political equilibrium. As yet we have no real theory to explain such
behavior, perhaps fortunately no method of endogenizing their existence. To
suggest that economic advisers could be paid from within the political system to
simulate this function is to ignore the message of public choice and even to
engage in rent seeking on behalf of the economics profession.

Charles K. Rowley 

George Mason University 
Money and Value: A Reconsideration of Classical and Neoclassical Monetary
Economics. By Jean-Michel Grandmont.

New York: Cambridge University Press. Pp. 199, $29.95. 

This is a beautifully written book. There is hardly a subject the author touches 
on that he does not enhance with the clarity and elegance of his exposition. 
Though it places much less emphasis on mathematical technique, the book
stands comparison with the classic texts of Debreu (1959) and Hildenbrand
(1974).

The material on which the book is based comes from a series of papers
written by Grandmont (1974) and Grandmont and Laroque (1975, 1976) in
the early seventies. The central problem is the existence of monetary
equilibrium. The essential features can be illustrated adequately by a simple
example. Imagine a pure exchange economy in which there are two periods and
two commodities, a consumption good and fiat money. A typical consumer
has an endowment of the good in each period and an endowment of money
in the first. Utility is a function of consumption in each period. Money is held
in order to transfer wealth from one period to the next. The expected price
of consumption in the second period is a function $\psi(p)$ of the consumption
good’s price $p$ in the first period. (This is a model of temporary equilibrium.)
The consumer’s demand for goods and money can be written as a function of
his endowments of goods, his initial real balances $\bar{m}/p$, and the relative price of
consumption in the two periods $\psi(p)/p$.¹ To achieve equilibrium in the first
period relative to given expectations about the second, we have to rely on
changes in the single endogenous variable $p$. A change in $p$ has two immediate
effects on the consumer. It alters his real endowment of money $\bar{m}/p$ (the real
balance effect [RBE]) and it alters the relative prices of present and future
consumption $\psi(p)/p$ (the intertemporal substitution effect [ISE]). For a given
expectation function the ISE may be weak. For example, in the case of static
expectations $\psi(p)/p \equiv 1$. There is no ISE and we must rely entirely on the
RBE. Unfortunately the RBE may not be equal to the task when $\psi(p)/p \equiv 1$. It
is easy to find cases where the consumer always wants to consume more than
his endowment in period 1. Then he must be a net supplier of money. If this
is true of every consumer, there will be an excess supply of money at every
price level. In order to show the existence of equilibrium, therefore, one
needs a sufficiently strong ISE. This can be obtained only by restricting the
form of the expectation function. Grandmont assumes that, for at least one
individual, $\psi(p)/p$ approaches zero (infinity) as $p$ approaches infinity (zero). In
other words, when $p$ is very large (small) future consumption is very cheap
(expensive) relative to present consumption. An excess supply of money
raises the price level, which leads agents to delay consumption, thus
increasing the demand for money. An excess demand for money similarly leads to a
fall in prices, reducing the demand for money as agents bring forward their
consumption.

1 The consumer’s decision problem is

\[
\begin{aligned}
\max\ & u(c_1, c_2) \\
\text{s.t.}\ & p(c_1 - e_1) + m - \bar{m} \le 0, \\
& \psi(p)(c_2 - e_2) - m \le 0, \\
& (c_1, c_2, m) \ge 0.
\end{aligned}
\]

If $m > 0$ at the optimum the demand for consumption goods is determined by the
problem

\[
\begin{aligned}
\max\ & u(c_1, c_2) \\
\text{s.t.}\ & (c_1 - e_1) + \frac{\psi(p)}{p}(c_2 - e_2) \le \frac{\bar{m}}{p},
\end{aligned}
\]

and the derived demand for real balances can be solved from the budget constraint:
$m/p = (\bar{m}/p) - (c_1 - e_1)$. Clearly $c_1$, $c_2$, and $m/p$ are functions of $e_1$, $e_2$,
$\bar{m}/p$, and $\psi(p)/p$.
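The contrast between static expectations and expectations with a strong ISE can be made concrete with a small numerical sketch. It assumes a single representative consumer with log utility u = ln(c1) + BETA ln(c2), and the parameter values, the constant expectation function, and all function names are illustrative assumptions rather than anything taken from the book.

```python
import math

BETA = 0.9          # weight on future utility in ln(c1) + BETA * ln(c2) (assumed)
E1, E2 = 1.0, 2.0   # goods endowments in periods 1 and 2 (assumed)
M_BAR = 1.0         # initial endowment of money (assumed)


def excess_money_demand(p, psi):
    """Real excess demand for money, m/p - M_BAR/p, at current price p.

    psi maps the current price to the expected second-period price.
    """
    r = psi(p) / p                      # relative price of future consumption
    wealth = E1 + r * E2 + M_BAR / p    # lifetime wealth in period-1 goods
    c1 = wealth / (1.0 + BETA)          # log-utility demand for c1
    # Money holdings cannot be negative, so real balances are floored at zero.
    real_balances = max(0.0, M_BAR / p - (c1 - E1))
    return real_balances - M_BAR / p


def static(p):
    """Static expectations: psi(p) = p, so psi(p)/p is identically 1."""
    return p


def inelastic(p, k=1.0):
    """Constant expected price k: psi(p)/p goes to 0 (infinity) as p grows (shrinks)."""
    return k


def find_equilibrium(psi, lo=1e-6, hi=1e6, iterations=200):
    """Bisect on the price level; return None if excess demand never changes sign."""
    f_lo = excess_money_demand(lo, psi)
    if f_lo * excess_money_demand(hi, psi) > 0:
        return None
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)        # geometric midpoint suits a price scale
        if excess_money_demand(mid, psi) * f_lo > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    print("static expectations:   ", find_equilibrium(static))
    print("inelastic expectations:", find_equilibrium(inelastic))
```

With these assumed endowments the consumer wants to consume more than E1 at every price under static expectations, so money is in excess supply throughout and no equilibrium price is found. Under the constant expectation function the excess demand changes sign, and in this log-utility case the clearing price has the closed form (k E2 + M_BAR)/(BETA E1).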

From this solution of the existence problem Grandmont draws two
important conclusions for “neoclassical” monetary theory. First, the emphasis
placed by Patinkin (1965) and others on the RBE as the central equilibrating
mechanism is misplaced. Second, the theoretical case for the neutrality of
money is undermined, since one of the necessary conditions for neutrality is
unit-elastic expectations.

The problem of proving the existence of an equilibrium is often seen to be 
a dry and purely technical one. This is merely a caricature of the truth.
"Existence” is fundamental to any general equilibrium theory. It alone estab¬ 
lishes the consistency of the relationships that are assumed to determine 
equilibrium. In any case, the existence "theorem” is only the hard result that 
comes at the end of a process of developing and motivating a substantive 
economic model, a process that can never be treated in a purely technical way. 
And as we have seen above, the analysis of the existence problem throws up 
important economic insights. In all these respects, the present work is exem¬ 
plary, but it goes even further in demonstrating the potential richness of the 
analysis of “existence.” Much of the book is devoted to showing that impor¬ 
tant questions about the efficacy of monetary policy can be reformulated and 
analyzed as questions of existence. Can the government control the money 
supply? This is really a question about the existence of equilibrium when the 
government's net supply of money is fixed. Similarly the question whether 
and to what extent the government can control the interest rate resolves itself 
into a question of the existence of an equilibrium when the government 
supplies bonds elastically at the given interest rate. The analysis ignores dy¬ 
namics, of course, but it is an important step to have established the possibility 
of equilibrium-consistent policies. Again, the nature of expectations is crucial. 
It is necessary to restrict the expectations function of every agent, not just 
one, and to assume that expectations are bounded above and bounded away 
from zero. 

The final application of these techniques is to the question of the liquidity
trap. The liquidity trap cannot, as some have suggested, prevent the existence 
of equilibrium. Only discontinuities can do that. But it can be shown that as 
the target interest rate approaches zero the required increase in the money 
supply becomes unboundedly large. Of course, under such conditions the 
boundedness of price expectations becomes rather unreasonable. One is led 
to speculate that the dependence of expectations on money supply may be the 
source of an even more intractable liquidity trap. 

Expectations clearly play a crucial role at every stage of the analysis in this
book. It is most unfortunate that both the general treatment of expectations
and the particular assumptions made about them have been overtaken by the
rational expectations hypothesis (REH). Since the original papers were
published, the REH has become the generally accepted tool for modeling
expectations. Whatever stories may be told about learning from past experience,
there is no serious theory in these pages to explain where the expectation
function $\psi$ comes from. Given the strong restrictions placed on $\psi$, it is most
unlikely to be consistent with the REH. There is an ad hoc air about the whole
construction that makes the theory less persuasive than it should be, despite
the generality and rigor of the analysis in other respects.

The temporary equilibrium framework is not fortuitous, however. It is
deliberately adopted to avoid a fundamental problem. In what is probably
the seminal paper in the modern general equilibrium theory of money,
F. H. Hahn (1965) pointed out that every monetary economy contains within
it a nonmonetary economy. Whatever service is provided by money, rational
agents hold money only because they can exchange it for something else. If
the value of money is zero it serves no function and (real) demand for money
must be zero. But the (real) supply is also zero, so the demand for and supply
of money are identically equal. The markets for other commodities will clear
under standard assumptions, and the resulting equilibrium is, in an obvious
sense, a nonmonetary equilibrium. By taking expectations to be exogenous,
Grandmont is able to finesse this problem: money can be guaranteed to have
value by assuming that it is expected to have value in the future. With rational
expectations there is no way of avoiding the Hahn problem, and this leads to
enormous technical difficulties. Irrational behavior seems too high a price to
pay for avoiding them, though.

Even on its own terms, the assumption of bounded expectations is
questionable. To justify it, one presumably appeals to the fact that prices normally lie
within some fairly narrow band. But this appeal to normal experience has
force only if the prices currently observed lie within the "normal" range.
Grandmont, on the other hand, assumes that expectations are bounded
whatever prices are currently observed. Furthermore, in order to get a sufficiently
large ISE to clear markets, it may be necessary that current equilibrium prices
lie far beyond the normal range. To avoid this evidently unsatisfactory
property one would have to show that prices and expected rates of inflation or
deflation lie within the normal range, in which case the boundedness
assumption would be superfluous. The reliance on bounded expectations is a poor
substitute for analyzing what features of the underlying structure give rise to
normal experience.

One does not have to look far to find reasons why the rate of inflation or
deflation is bounded. One is the existence of alternative assets. If the rate of
deflation becomes too large, no one will hold any productive assets and there
must be excess demand for goods. This cannot happen in equilibrium.
Similarly, if the rate of inflation rises too far, individuals will cease to hold money,
preferring to hold goods instead. A sensible theory has natural bounds on the
expected rate of inflation or deflation. It does not rely on an arbitrarily large
ISE derived from ad hoc assumptions on expectations. Of course, it is not
easy to introduce alternative assets into these models. Grandmont does have
bonds or credit in some of his models, but the treatment is either restrictive or
unsatisfactory. In chapter 2 there is a credit market, but individuals are
allowed to borrow only from the government, so money is the only asset they
“hold.” In chapter 4 he introduces perpetuities, which are demanded because
their coupon is offset by expected capital losses. This is Keynes’s speculative
motive, of course, and depends crucially on bounded expectations. But here
the expectations are severely irrational. In order for any individual to hold
money he must expect, with positive probability, a capital loss greater than the
coupon. But then for consistency he must expect that, again with positive
probability, the price of bonds will be negative in finite time. Even the most
weakly rational agent cannot be expected to hold such beliefs. Nor are things
made any better by introducing bonds with a finite date to maturity. To
reconcile money and other assets satisfactorily one needs transactions costs.

To sum up, there are at least three respects in which the present theory 
needs to be improved. The arbitrary exogenous expectation function ought 
to be replaced by the REH. Then substitution between money and assets 
should be used in place of the arbitrarily large ISE to attain equilibrium. 
Finally, one needs transactions costs or something similar to explain why 
money is held when there are alternative, interest-bearing assets. Some prog¬ 
ress in this direction has been made by Martin Hellwig and myself (1984). 
Although we deal with rather different models we have had to face many of 
the problems encountered by Grandmont and use similar techniques to re¬ 
solve them. Grandmont’s pioneering work has allowed others to see more 
clearly what needs to be done and how to do it. For this he deserves the 
admiration and gratitude of all economists. 

Douglas Gale 

London School of Economics 
Optimal Severance Pay with
Incomplete Information 


Charles M. Kahn 

University of Chicago 


This paper contrasts optimal employment contracts in a world of full 
information and in a world in which the worker’s alternative employ¬ 
ment possibilities cannot be observed by the firm. The difference 
between income and severance pay—that is, the opportunity cost of 
separating—will be higher in more productive states of nature. With 
complete wage insurance no longer possible, we therefore have an 
explanation of higher-ability individuals receiving high wages in the 
short run in long-term contracts. The explanation does not rely, as 
past explanations have done, on the imperfections in bonding ar¬ 
rangements or in loan markets. 


A recent popular approach to the study of long-term labor arrange¬ 
ments is to regard them explicitly or implicitly as a form of insurance 
of workers by firms. A major embarrassment of this approach is that 
the insurance seems so weak. 

Consider an individual embarking on a career. Although he may
know the average prospects for advancement in this field, he will not
know his own prospects. It would be natural then for such an
individual to desire insurance: a chance to trade income if successful for
income if not successful. Moreover a large firm should be able to offer
such insurance to its new employees. It takes on a number of
entry-level employees each year and should be able to establish an
actuarially profitable risk-pooling arrangement.

I am grateful to Joseph Altonji, Gary Becker, Russell Cooper, Sherwin Rosen, Jose
Scheinkman, Robert Topel, and Timothy W'eilliers as well as to participants in
seminars at the Universities of Chicago, Iowa, Minnesota, and Pennsylvania. This research
was supported by NSF grant no. SES-8208767.

[Journal of Political Economy, 1985, vol. 93, no. 3]

© 1985 by The University of Chicago. All rights reserved. 0022-3808/85/9303-0005$01.50




In fact, such agreements seem not to occur; instead the variation in
income of individuals increases with years on the job. If individuals 
are indeed risk averse, why is there not more smoothing of income 
across different employees? 

One reason for incomplete insurance is the moral hazard problem. 
If effort is an important but largely unobservable input, pay may be 
based on “results” through incentive programs or profit-sharing 
schemes. However, if the effort is monitorable, or if the connection 
between observable results and individuals’ effort is too weak, such 
schemes will not be used. 

A second explanation of incomplete insurance stems from legal
prohibitions on indentured servitude. Individuals who are discovered
to be valuable to the firm will threaten to leave unless their pay is
increased (Holmstrom 1981). Other authors have argued that various
forms of bonding (for example, nonvested pensions and tilted
compensation schemes) can effectively prevent agents from quitting
(Kennan 1979; Ioannides and Pissarides 1980; Lazear 1982).

This paper develops an alternative explanation for incomplete
insurance that requires neither moral hazard nor limitations on
bonding. In my model I assume that a firm is not directly able to monitor
the worker’s alternative job opportunities. This limitation seems to be
a realistic assumption. Suppose an employee receives an offer from
an outside firm. He may be able to provide documentation of certain
aspects of the offer to his current employer. However, as we shall see,
it is in general in the employee’s interest to slant the information he
provides. The firm making the offer of course has no interest in
providing the initial firm with confirmation of its terms. Nor is there
any easy way for the initial employer to determine the employee’s
private valuation of nonpecuniary benefits. Finally, these informational
problems become even greater if the alternative is self-employment
or home production.

It is part of an efficient contract that individuals should move to
alternate jobs at which they are more productive. However, I will
show that in situations with workers in possession of private information
about alternative opportunities, full insurance will typically result
in the worker’s quitting too frequently. Thus the contract will require
some penalty to be applied to a worker in order to reduce this temptation.
In general the best contract will be a compromise. The worker
will still quit too frequently and the firm will offer only partial
insurance of his income.

The main testable result of this model is the prediction that for
higher-productivity individuals the “severance penalty” (the gap
between the wage and the pay upon quitting) is greater in order to
guarantee that they quit less often. In typical cases this gap will be
increased partly by decreases in severance pay and partly by increases
in wages. Several examples are examined in Section IV.

There is a wide literature examining the problem of establishing
optimal contracts under differential information (see Salop and Salop
1976; Hall and Lilien 1979; Chari 1983; Cooper 1983; Green and
Kahn 1983; Grossman and Hart 1983). For other studies like the
present one, in which the private information is the worker’s alternative
job, see Arnott, Hosios, and Stiglitz (1980) and Nalebuff and
Zeckhauser (1981). My approach is in contrast to that of Mortensen
(1978), Bronars (1981), and Topel (1983), who treat the informational
inefficiencies in separations as due to the inability of employers
to monitor the effort expended by employees in searching for new
jobs. 1

I. The Model 

Together a firm and a worker can produce a nonnegative random
variable $\theta$ of output. To do so requires all of the firm’s resources and
all of the worker’s time. If the worker does not stay with the firm, he
uses his time in another job, whose benefits are parameterized by $v$.
For simplicity I will assume that there is no alternative use of the
firm’s resources and that the firm is risk neutral. I will call $\theta$ on-the-job
productivity and $v$ outside productivity. The joint distribution of $\theta$ and $v$ is
known by both agents, but neither knows the realization of either
variable at the date the contract is signed. 2 For convenience I will take
the two variables to be independently distributed. I will consider the
case of correlation briefly in Section VI.

Let $r$ represent a payment made by the initial firm to the worker. If
the individual receives the payment but works for the alternative firm
his utility will be $w(r, v)$. (Assume $w_r > 0$, $w_v > 0$, and $w_{rr} < 0$.) If he
receives $r$ and works for the initial firm his utility will be $u(r) = w(r, 0)$.

A contract is composed of two elements. First there is a pair of 


1 Two additional considerations have also been investigated in the literature. The
first is the possibility that in addition to the employment contract, the worker may own a
portfolio of assets, complicating his choices. This possibility is examined by Topel
(1983). The second is the possibility of the other party's gaining the information at a
cost, and the circumstances under which it is desirable to do so. These considerations
are examined by, among others, Eaton and White (1981) and Guasch and Weiss (1981).

2 Although much effort in theoretical economics has gone into modeling problems
where one side or the other has private information at the time of hire (or on both
sides; see Cooper [1982]; Green and Honkapohja [1983]), it seems in fact that the
assumption of equal ignorance is the most natural one in the case of new hires. Each
party can observe in at least a crude way the success rate among other hires in similar
positions, but it is unlikely that either has inside information on any particular new
entrant’s prospects. For problems where the information is available at the bargaining
date, see Myerson (1982).
schedules $r_1(\theta, v)$ and $r_2(\theta, v)$. The schedule $r_1$ describes the
payment (possibly negative) to be made by the firm to the employee if
the worker stays with the firm (I will call this on-the-job compensation),
and $r_2$ describes the payment to be made if the worker goes elsewhere
(I will call this severance pay). The other
element of the contract is an indicator variable $y(\theta, v)$, which equals
one in those states in which the worker stays with the firm and zero
otherwise. There is some redundancy in this description of the contracts,
since $r_1(\theta, v)$ is irrelevant for cases where $y(\theta, v) = 0$ and $r_2$ is
irrelevant when $y = 1$. We will endure this redundancy for the sake of
notational simplicity.

When negotiating ex ante, firm and worker will pick from the set of
ex ante Pareto-optimal contracts. A contract $\{r_1^*(\cdot), r_2^*(\cdot), y^*(\cdot)\}$ is ex
ante Pareto optimal if it maximizes

$E\{[y\,u(r_1) + (1 - y)\,w(r_2, v)]k + [y(\theta - r_1) + (1 - y)(-r_2)]\}$ (1)


for some positive value of $k$. The first bracketed expression describes
ex ante utility; the second describes profits. The choice of $k$ depends
on the relative power of the two parties to the bargain and on the
alternatives available to each ex ante.


II. First-Best Contract 

If the parties’ actions can be made contingent on the state $(\theta, v)$
verifiable by both sides, then the maximization of (1) is a particularly
simple variational problem. The first-order conditions are

$u'[r_1^*(\theta, v)] = k^{-1}$ (2)

and

$w_r[r_2^*(\theta, v), v] = k^{-1}$ (3)

for all $(\theta, v)$. Given the optimal $r_1$ and $r_2$, the optimal $y$ is that which
maximizes the sum of profit and utility, weighted by $k$. That is,

$y^*(\theta, v) = 1$ if $ku + (\theta - r_1^*) > kw - r_2^*(v)$, (4)

$y^*(\theta, v) = 0$ if $ku + (\theta - r_1^*) < kw - r_2^*(v)$. (5)

From these conditions we can deduce:

Theorem 1. The full information optimal contract has the following
properties:

a) Insurance is complete (i.e., the marginal utility of income is constant
in all states). Compensation on the job is constant, and compensation
to employees who leave does not depend on on-the-job productivity $\theta$.

b) The payment to individuals who leave increases (decreases) with
the desirability of the outside job if $w_{rv}$ is positive (negative).

c) Employment is efficient. For any given level of on-the-job productivity
$\theta$, there is a critical level of outside productivity $v^*(\theta)$
such that the individual leaves if outside productivity is greater than
$v^*$ and remains if outside productivity is less. This critical level
increases with $\theta$.

(Note that the criteria for efficient employment [4] and [5] in general
vary with $k$.) 3
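
As a concrete illustration of theorem 1, the following minimal sketch evaluates conditions (2)-(5) for log utility. The choice $u(r) = \ln r$, the two functional forms for $w$, the parameter values, and the function names are assumptions made here for illustration; they are not taken from the paper.

```python
import math

def first_best_log(k, theta, form="separable"):
    """First-best contract for u(r) = ln(r) under conditions (2)-(5).

    form="separable":   w(r, v) = ln(r) + v   (illustrative assumption)
    form="substitutes": w(r, v) = ln(r + v)   (illustrative assumption)
    Returns (r1, r2, v_star), where r2 is a function of v.
    """
    r1 = k                      # (2): u'(r1) = 1/r1 = 1/k
    if form == "separable":
        r2 = lambda v: k        # (3): w_r = 1/r2 = 1/k, independent of v
        v_star = theta / k      # (4)-(5): quit iff k*v > theta
    elif form == "substitutes":
        r2 = lambda v: k - v    # (3): w_r = 1/(r2 + v) = 1/k
        v_star = theta          # (4)-(5): quit iff v > theta
    else:
        raise ValueError(form)
    return r1, r2, v_star

def surplus_gap(k, theta, v, form="separable"):
    """Left side minus right side of condition (4) at the first-best payments.

    Positive means staying is efficient, negative means quitting is efficient.
    """
    r1, r2, _ = first_best_log(k, theta, form)
    if form == "separable":
        w = math.log(r2(v)) + v
    else:
        w = math.log(r2(v) + v)
    return (k * math.log(r1) + theta - r1) - (k * w - r2(v))
```

In the separable case severance pay does not vary with $v$ (here $w_{rv} = 0$), while in the perfect-substitutes case it falls one for one with $v$ (here $w_{rv} < 0$), in line with part b. In both cases `surplus_gap` changes sign exactly at the returned threshold `v_star`.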

III. The Second-Best Problem 

The contract described above is the ideal arrangement in a world in 
which both employer and employee can verify any outside offer the 
employee receives. However, in the first-best contract, the employee’s 
interests ex post do not coincide with the contracting arrangements. 
The employee’s interest is the maximization of his own utility; the 
contract specifies the maximization of a weighted sum of utility and 
profits. The nature of the divergence is made more precise in the 
following theorem, which also follows immediately from the first- 
order conditions above: 

Theorem 2. If $v$ is a normal good, then in a first-best contract, the
employee prefers to quit in some states where the contract requires 
him to remain with the firm. (If v is an inferior good, he prefers to 
remain with the firm in some states where the contract requires him to 
leave.) 

Thus if the firm relies on the worker to state $v$, there is in general an
incentive for the worker to misrepresent the value of the variable. It is
reasonable to assume that the employer knows the distribution of
offers that may occur, but that he treats any announcement of the
value of $v$ by the employee as manipulated. Under such circumstances,
we can see that there is never a point to offering a contract
with a full range of severance pays for varying values of $v$, since the
employee will always declare a value of $v$ that yields the most attractive
compensation. Thus the second-best contract simply offers a pay for
working and a pay (or penalty) for not working and allows the employee
to choose whichever he will.

As before we assume that the contract is drawn up in ignorance of
the values of $\theta$ and $v$. Then both sides observe $\theta$ and the worker (but
not the firm) observes $v$. Given a pair of payments $r_1$ for working and
$r_2$ for not working it is known that the worker in any instance will


3 An exception is the case of a utility function with v and r as perfect substitutes (see 
Sec. IV). 
simply pick the action $y$ that maximizes his utility:

$\max_{y \in \{0, 1\}} [y\,u(r_1) + (1 - y)\,w(r_2, v)]$. (6)

The worker’s decision serves as a constraint on the joint maximization
problem. The second-best problem then is to find a contract
$\{r_1(\theta), r_2(\theta)\}$ that maximizes (1) subject to the additional constraint that
$y(\theta, v)$ solves (6) for every pair $(\theta, v)$. 4

Let the function $\bar v(r_1, r_2)$ be implicitly defined by

$u(r_1) = w(r_2, \bar v)$. (7)

In other words, given a pair of payments $r_1$ and $r_2$, the worker quits
whenever outside opportunities exceed $\bar v$. Then the ex ante probability
of a quit given payments $r_1$ and $r_2$ is

$Q(r_1, r_2) \equiv 1 - G[\bar v(r_1, r_2)]$, (8)

where $G(\cdot)$ is the probability distribution of $v$.

With these definitions, we can write the second-best problem as
follows: 5


$\max_{r_1, r_2} J(\theta, r_1, r_2)$,

where

$J = (1 - Q)(\theta - r_1) - Q r_2 + k E_v[H^*(r_1, r_2, v)]$,

$H^* = \max_{y \in [0, 1]} H(r_1, r_2, v, y)$, (9)

$H = y\,u(r_1) + (1 - y)\,w(r_2, v)$,

and $Q(r_1, r_2)$ is defined in (8). This problem is to be solved for every
value of $\theta$. 6 We have the following first-order conditions:

$\frac{\partial Q}{\partial r_1}(r_1 - r_2 - \theta) - (1 - Q) + k u'(r_1)(1 - Q) = 0$, (10)


4 More generally, we might wish to allow the possibility that the contract may specify
random-valued outcomes.

5 It might appear as if we had changed the ground rules from the first section, since
we now allow the worker the choice whether or not to quit, while no such option was
granted in the full information case. In fact, we could have granted such an option in
the full information case as well. The description of the contract would have been
slightly different, but the outcomes in every state would have been identical. The
optimal contract would simply specify penalties for choosing the “socially incorrect”
option, and these penalties would be sufficiently large that the employee would
“voluntarily” choose the employment levels we have specified.

6 I am grateful to Robert Topel for suggesting this formulation. Note that allowing $y$
to lie anywhere in the interval [0, 1] (rather than confining it to the endpoints) does not
change the value of $H^*$ since the maximum is always attained at an endpoint.
$\frac{\partial Q}{\partial r_2}(r_1 - r_2 - \theta) - Q + k \int_{\bar v} w_r(r_2, v)\,dG(v) = 0$. (11)

In words, $r_1$ is chosen so that the expected increase in profits from a
marginal decrease in $r_1$ offsets the expected loss to the worker,
weighted by $k$. The worker’s loss is the third term, the marginal utility
of income in the case where the worker stays with the firm, times the
probability that he does so. Profits have two components. The first
term represents the decreased likelihood of the worker's staying with
the firm times the profit that accrues if he does. The second term is
the $1.00 increase in profits from the dollar decrease in $r_1$, multiplied
by the likelihood of realizing it. The equation for $r_2$ is similar, except
for the complication that the worker’s marginal utility of income is not
necessarily constant at other occupations, but could vary with $v$.

For comparison we might consider the case in which there is no
problem forcing the worker to make the efficient decision, but severance
pay is constrained to be a single value $r_2$, no matter what $v$ is. In
such a problem the first-order conditions will be

$u'(r_1) = k^{-1}$,

$Q^{-1} \int_{\bar v} w_r(r_2, v)\,dG(v) = k^{-1}$.

Again, expected marginal utilities of income are equated. If the first
terms in (10) and (11) were zero, the conditions would be identical.
When these terms are nonzero, they represent distortions caused by
the incentive for inefficient separations. Their signs depend on the
sign of $\theta - r_1 + r_2$, the marginal profit to the firm from retaining the
worker. If this quantity is positive then the second best has the worker
receiving insufficient income for full insurance when he quits relative
to the income he receives on the job.

Theorem 3. If $v$ is distributed independently of $\theta$, then in the
second-best contract $\bar v$ is nondecreasing in $\theta$. In other words, quits are
no more frequent at high levels of on-the-job productivity than at low
levels.

Corollary. In the second-best contract, at every $\theta$, either $dr_1/d\theta \ge 0$
or $dr_2/d\theta \le 0$. In other words, at every $\theta$, either on-the-job compensation
is a nondecreasing function of productivity or severance pay is
nonincreasing.

Proof of Theorem 3. Note that $J(\theta, r_1, r_2)$ can be rewritten as
$\theta[1 - Q(r_1, r_2)] + N(r_1, r_2)$, where


$N(r_1, r_2) = -r_1(1 - Q) - r_2 Q + k E_v[H^*(r_1, r_2, v)]$.

Let $J^*(\theta) = J[\theta, r_1(\theta), r_2(\theta)]$, where $r_1(\theta)$ and $r_2(\theta)$ are second-best
optimal choices for a given $\theta$. Let $Q^*(\theta)$ and $N^*(\theta)$ be defined similarly.
Then, for any two productivity levels $\theta$ and $\hat\theta$,

$\theta[1 - Q^*(\theta)] + N^*(\theta) \ge \theta[1 - Q^*(\hat\theta)] + N^*(\hat\theta)$

and

$\hat\theta[1 - Q^*(\hat\theta)] + N^*(\hat\theta) \ge \hat\theta[1 - Q^*(\theta)] + N^*(\theta)$

so that $(\theta - \hat\theta)[Q^*(\hat\theta) - Q^*(\theta)] \ge 0$, and if $\theta > \hat\theta$, then $Q^*(\hat\theta) \ge Q^*(\theta)$.
Thus as $\theta$ increases, the frequency of quits cannot increase.
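
The final step of this proof can be spelled out. Writing $\hat\theta$ for the second productivity level (a notational choice made here), each contract is optimal at its own productivity level, so it does at least as well there as the other contract would. Adding the two resulting inequalities, the $N^*$ terms cancel and only the quit probabilities remain:

```latex
\begin{align*}
\theta[1 - Q^*(\theta)] + \hat\theta[1 - Q^*(\hat\theta)]
  &\ge \theta[1 - Q^*(\hat\theta)] + \hat\theta[1 - Q^*(\theta)] \\
\Longrightarrow\quad
\theta[Q^*(\hat\theta) - Q^*(\theta)] - \hat\theta[Q^*(\hat\theta) - Q^*(\theta)]
  &\ge 0 \\
\Longrightarrow\quad
(\theta - \hat\theta)\,[Q^*(\hat\theta) - Q^*(\theta)] &\ge 0 .
\end{align*}
```

This is a standard revealed-preference argument: it uses only optimality at each productivity level and no differentiability of the contract.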

In the first-best case, optimum severance pay and optimum on-the-job
compensation did not depend on the productivity variable $\theta$. In
the second-best case, however, the optimal contract in general will
make both wage and severance pay contingent on this variable. The
following condition is an immediate consequence of conditions (10)
and (11):

Theorem 4. Suppose an optimal contract has two levels of $\theta$ such
that, for each, there is a nonzero probability of staying with the firm
and a nonzero probability of leaving. Then either compensation or
severance pay must differ between the two levels.

In effect, the firm has no direct control over the worker's decision
to quit, since the worker can always pretend his alternative offers are
sufficiently attractive. As $\theta$ goes higher, the firm wants to guarantee
that the worker quits less frequently, and the way to do this is to
increase the spread between income and severance pay, even at the
expense of full insurance.

IV. Examples 

First we will consider the case where $w(r, v) = u(r + v)$, so that the
alternative opportunity is a perfect substitute for income.

Theorem 5. If income and the outside opportunity are perfect
substitutes, then in the optimal contract compensation increases and
severance pay decreases with increases in on-the-job productivity.

Proof. The first-order conditions become

$-G'(\bar v)(r_1 - r_2 - \theta) - (1 - Q) + k u'(r_1)(1 - Q) = 0$; (12)

$G'(\bar v)(r_1 - r_2 - \theta) - Q + k \int_{r_1 - r_2} u'(r_2 + v)\,dG(v) = 0$; (13)

and combining the two we have

$E u'[\max(r_1, r_2 + v)] = k^{-1}$ (14)

for all $\theta$. The corollary to theorem 3 states that as $\theta$ increases, either $r_1$
increases or $r_2$ decreases. In this example both must occur in order for
the left side of equation (14) to remain constant.
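
The step that combines (12) and (13) into (14) can be written out as follows. This sketch assumes only what the text states: with perfect substitutes the quit threshold is $\bar v = r_1 - r_2$, the worker stays with probability $1 - Q$, and his income-equivalent is $r_1$ if he stays and $r_2 + v$ if he quits.

```latex
\begin{align*}
\text{(12)} + \text{(13)}:\quad
 -(1 - Q) - Q
 + k\Big[(1 - Q)\,u'(r_1) + \int_{r_1 - r_2} u'(r_2 + v)\,dG(v)\Big] &= 0 \\
\Longrightarrow\quad
 k\,E\,u'[\max(r_1,\; r_2 + v)] &= 1 .
\end{align*}
```

The terms in $G'(\bar v)$ enter (12) and (13) with opposite signs and cancel, so the incentive distortion shifts marginal utility between the two outcomes while leaving its expected value equal to $k^{-1}$.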
Corollary. Insurance is incomplete; the marginal utility of income
is higher when the worker stays on the job than when he quits.

Proof. Immediate from (14).

Theorem 6. If income and the outside job are perfect substitutes,
then workers remain with their initial firm inefficiently often.

Proof. From equations (12) and (13) we can deduce that 7 $r_1 - r_2 - \theta > 0$.
For this utility function, in the full information contract $v^* = \theta$.
In other words, the efficiency conditions for production require the
worker to quit whenever $v$ is greater than $\theta$. But by the definition (7),
$\bar v = r_1 - r_2$. Thus $v^* < \bar v$, and quits are too rare.

The other example we consider is the case where the utility function
is separable: $w(r, v) = u(r) + v$. In examining this example we
make the additional assumption

$\min(v) > 0$. (15)

In other words, at equal levels of pay the current job is always less
desirable than the alternative (e.g., staying home).

Theorem 7. If utility is separable then the marginal utility of in¬ 
come is lower for individuals who remain with the firm than for those 
who quit the firm. (Note that this result is the opposite of the previous 
example’s.) 

Proof. From assumption (15) we know that $r_1 > r_2$. The result
follows immediately given the form of the utility function.

Theorem 8. In a second-best contract with a separable utility function,
severance pay decreases with increases in on-the-job productivity.

Proof. By theorem 7 we know that $k - [u'(r_2)]^{-1} > k - [u'(r_1)]^{-1}$.
Moreover, the signs of these two expressions must be opposite in
order for the first-order conditions to be satisfied:

$-G'(\bar v)(r_1 - r_2 - \theta) + (1 - Q)\{k - [u'(r_1)]^{-1}\} = 0$, (16)

$G'(\bar v)(r_1 - r_2 - \theta) + Q\{k - [u'(r_2)]^{-1}\} = 0$. (17)

Therefore $\theta > r_1 - r_2$. Combining equations (16) and (17) we have

$(1 - Q)[u'(r_1)]^{-1} + Q[u'(r_2)]^{-1} = k$, (18)

where

$Q = 1 - G[u(r_1) - u(r_2)]$. (19)

From (18) and (19) we deduce that if $Q$ increases then so does $u(r_2)$.
Appeal to theorem 3 completes the proof. Note that the corresponding
result for $u(r_1)$ is indeterminate.
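
The separable case can be explored numerically. The sketch below is an illustration under assumptions that are not in the paper: $u(r) = \ln r$, $v$ uniform on $[a, b]$ with $a > 0$ as in (15), $k = 1$, and a coarse-to-fine grid search standing in for a proper optimizer. It maximizes the objective $J$ of (9) directly instead of solving (16)-(17), so that the first-order conditions can be checked afterwards.

```python
import math

def make_J(theta, k=1.0, a=0.5, b=2.5):
    """Objective J of (9) for w(r, v) = ln(r) + v and v ~ Uniform[a, b].

    Log utility, the uniform distribution, and the default parameters are
    illustrative assumptions, not values from the paper.
    """
    def J(r1, r2):
        v_bar = math.log(r1) - math.log(r2)   # quit threshold from (7)
        v_c = min(max(v_bar, a), b)           # clamp to the support of v
        Q = (b - v_c) / (b - a)               # quit probability (8)
        # E_v[H*] = (1 - Q) u(r1) + Q u(r2) + E[v; v > v_bar]
        EH = ((1 - Q) * math.log(r1) + Q * math.log(r2)
              + (b * b - v_c * v_c) / (2 * (b - a)))
        return (1 - Q) * (theta - r1) - Q * r2 + k * EH
    return J

def grid_argmax(f, lo=0.05, hi=4.0, n=41, rounds=6):
    """Coarse-to-fine grid search for a maximizer of f on [lo, hi]^2."""
    lo1 = lo2 = lo
    hi1 = hi2 = hi
    for _ in range(rounds):
        s1, s2 = (hi1 - lo1) / (n - 1), (hi2 - lo2) / (n - 1)
        _, x, y = max((f(lo1 + i * s1, lo2 + j * s2),
                       lo1 + i * s1, lo2 + j * s2)
                      for i in range(n) for j in range(n))
        # Shrink the search box to two grid steps around the best point.
        lo1, hi1 = max(lo, x - 2 * s1), min(hi, x + 2 * s1)
        lo2, hi2 = max(lo, y - 2 * s2), min(hi, y + 2 * s2)
    return x, y

def solve(theta, k=1.0, a=0.5, b=2.5):
    """Return (r1, r2, Q) for the numerically optimal second-best contract."""
    r1, r2 = grid_argmax(make_J(theta, k, a, b))
    v_bar = math.log(r1) - math.log(r2)
    Q = (b - min(max(v_bar, a), b)) / (b - a)
    return r1, r2, Q

if __name__ == "__main__":
    for theta in (1.0, 1.5):
        r1, r2, Q = solve(theta)
        print(f"theta={theta}: r1={r1:.3f}  r2={r2:.3f}  Q={Q:.3f}")
```

With log utility $[u'(r)]^{-1} = r$, so condition (18) reduces to $(1 - Q) r_1 + Q r_2 = k$, which gives a simple check on the numerical solution. Under these assumed parameters the solution should also show $r_1 > r_2$ (theorem 7), $\theta > r_1 - r_2$, and a lower quit probability and lower severance pay at the higher $\theta$ (theorems 3 and 8).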


7 Of course, if the firm in general is making positive profits, then $\theta$ is greater than $r_1$.
If so, then $r_2$ must be negative. That is, the worker pays the firm to leave.
$u(r_1) - u(r_2)$


we can conclude:

Theorem 9. In a second-best contract quits occur too frequently
relative to a first-best contract with parameter $k^*$.

Proof. $\bar v = u(r_1) - u(r_2) = (k^*)^{-1}(r_1 - r_2) < (k^*)^{-1}\theta$, where the final
expression represents the borderline level of productivity in an outside
job such that a quit is just efficient.


V. Comparative Statics 

We now consider the effects of changes in the distribution of outside
opportunities. In this examination we will confine our attention to
separable utility functions. We assume that all the distributions we
consider are drawn from a set $S$, a connected open subset of $C^1$, within
which the second-best optimal contract varies continuously. Given this
assumption, we can characterize all comparative statics properties by
examining two apparently special types of changes in the distribution
function. To see why, it is helpful to return to the first-order conditions
(10) and (11). Changes in $G(\cdot)$ can affect the first-order conditions
in two ways, either through $Q$ directly or through the partial
derivative of $Q$. In either case, the effect depends solely on the behavior
of $G(\cdot)$ in the vicinity of $\bar v$.

If a change in $G(\cdot)$ does not affect the function or its first derivative
in a neighborhood of $\bar v$, then the first-order conditions are unchanged.
Any continuous path in $S$ along which this is the case will be
called a neutral path. It is then immediate that:

Lemma. Along any neutral path the second-best contract is constant.

Definition. A distribution function is locally uniform at $\bar v$ if there is an
open set containing $\bar v$ on which $G'(\cdot)$ is constant.

8 Other possible bases for comparison would be to use the first-best contract yielding
(1) the same expected marginal utility; (2) the same expected absolute utility; or (3) the
same value for $k$ in the two problems. I have been unable to determine unambiguous
results for these alternate comparisons.
We state without proof:

Theorem 10. Between any two distributions in S there is a continu¬ 
ous path made up of segments of the following forms: 

1. neutral paths, 

2. paths whose elements are locally uniform distribution functions
with constant $G(\bar v)$, and

3. paths whose elements are locally uniform distribution functions
with constant $G'(\bar v)$.

Since the optimal contract does not vary along neutral paths, com¬ 
parative statics properties within S can be deduced once we know 
behavior along paths of types 2 and 3. 

Theorem 11. If $G$ is a locally uniform distribution function subjected
to a perturbation that leaves $G(\bar v)$ unchanged and increases
$G'(\bar v)$, then quits decrease.

Proof. Differentiation of (7) in the case of a separable utility function
yields $d\bar v = u'(r_1)\,dr_1 - u'(r_2)\,dr_2$. Totally differentiating (16) and
(17),

$\begin{bmatrix} dr_1 \\ dr_2 \end{bmatrix} = -dG'(\bar v)\,J_{rr}^{-1} \begin{bmatrix} -(r_1 - r_2 - \theta)\,u'(r_1) \\ (r_1 - r_2 - \theta)\,u'(r_2) \end{bmatrix}$,

where we assume that $J_{rr}$, the Hessian of $J(\theta, \cdot, \cdot)$, is nonsingular
(and hence negative definite at the optimum).
Therefore $\mathrm{sgn}(d\bar v) = -\mathrm{sgn}[(r_1 - r_2 - \theta)\,dG'(\bar v)]$, and of course
$\mathrm{sgn}(dQ) = -\mathrm{sgn}(d\bar v)$.

For separable utility functions we know that $(r_1 - r_2 - \theta)$ is negative
(see the proof to theorem 8); thus a change that increases $G'(\bar v)$ moves
$\bar v$ upward.

To understand this result, recall that for separable functions, quits
occur too frequently; that is, $\bar v$ is too low compared with the first best.
Putting greater probability on the borderline case therefore makes it
profitable to move $\bar v$ in the “efficient” direction. This result generalizes:
local increases in $G'(\bar v)$ will cause $\bar v$ to rise if quits are too frequent
in the second best; they will cause $\bar v$ to fall if quits are too rare in the
second best.

Theorem 12. If $G$ is a locally uniform distribution function subjected
to a perturbation that leaves $G'(\bar v)$ unchanged and decreases
$G(\bar v)$, both $r_1$ and $r_2$ rise. Thus the change in $\bar v$ is ambiguous. However,
$Q$ rises on net.

Proof. See Appendix. 

The usefulness of these two results stems from the fact that any
change in $G(\cdot)$ for which the second-best optimal choice responds
continuously can be decomposed into a series of changes of these two
forms. For instance, consider a change that shifts the distribution of $v$
to the right in the sense of first-order stochastic dominance. Let the
density decrease in a range $(v_a, v_b)$, increase in a range $(v_b, v_c)$, and
remain unchanged elsewhere. From theorem 3 we know that in general
each $\theta$ is associated with a level of $\bar v[r_1(\theta), r_2(\theta)]$, where $\bar v$ increases
with $\theta$. For extreme values of $\theta$, such that $\bar v \notin (v_a, v_c)$, the contract is
unchanged. For the $\theta$ such that $\bar v = v_b$, the change does not affect
$G'(\bar v)$; therefore for this value of $\theta$ there is an increase in quits. Quits
also increase for $\theta$ associated with values of $\bar v$ in the region $(v_a, v_b)$;
here the two effects described in theorems 11 and 12 are reinforcing.
In the segment $(v_b, v_c)$, the results are ambiguous.

VI. Correlation between Inside and 
Outside Productivity 

So far I have considered the case where there is no correlation at all
between productivity on this job and productivity elsewhere. In this
section I briefly consider the case where these productivities are correlated,
so that success in the initial firm provides information as to
the employee’s opportunities elsewhere.

If productivity on the job is a perfect predictor of outside offers, the
information problem disappears. Suppose that whenever on-the-job
productivity is $\theta$, productivity in alternate employment is $a(\theta)$ with
certainty. Then, from the efficiency formulae, separation should occur
if


$k w[r_2(\theta), a(\theta)] - r_2(\theta) > k w(r_1, 0) + \theta - r_1$,

where $r_1$ and $r_2$ are chosen to provide optimal insurance. Differentiating
the left side with respect to $\theta$ we see that increases in $\theta$ increase the
value of separation by $k(w_v a' + w_r r_2') - r_2' = k w_v a'$. Since $w(r_1, 0)$ does
not depend on $\theta$, increases in $\theta$ increase the value of continued employment
by one.

Thus whether increased productivity on the job leads to separations
or not depends primarily on the magnitude of $a'(\theta)$. If $a'(\theta)$ is large, it
is efficient for the more able worker to leave. If $a'(\theta)$ is small, it is
efficient for the less able to leave.
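
The derivative quoted above can be checked in two lines. The sketch below assumes, as the text does, that $r_2$ satisfies the full-insurance condition (3), so that $k w_r = 1$; it writes $w$ with the payment as its first argument and $a(\theta)$ for the outside productivity implied by $\theta$.

```latex
\begin{align*}
\frac{d}{d\theta}\Big\{k\,w[r_2(\theta), a(\theta)] - r_2(\theta)\Big\}
  &= k\,w_v\,a'(\theta) + (k\,w_r - 1)\,r_2'(\theta)
   = k\,w_v\,a'(\theta), \\
\frac{d}{d\theta}\Big\{k\,w(r_1, 0) + \theta - r_1\Big\} &= 1 .
\end{align*}
```

A marginal increase in $\theta$ therefore tilts the comparison toward separation exactly when $k\,w_v\,a'(\theta) > 1$. In the separable case $w_v = 1$, so the condition is simply $a'(\theta) > k^{-1}$.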

Next I consider the general case of correlation between $\theta$ and $v$.
Now $G(v)$, the marginal distribution of $v$, is a function of $\theta$. However,
given any value of $\theta$, the first-order conditions (10) and (11) are unaffected.
Therefore:

Theorem 12. The efficiency properties of a second-best contract in
any state $\theta$ do not depend on whether or not $v$ and $\theta$ are correlated.

However, comparative statics are complicated considerably. If the
correlation between $\theta$ and $v$ is low we would expect theorem 3 to
continue to hold. On the other hand, if the correlation were
sufficiently high, efficiency might better be served by increasing the
quit rate with increasing $\theta$. The Appendix investigates this relationship
in the case of separable utility functions and uniform distributions.
The result is: 9

Theorem 13. Suppose the correlation between $\theta$ and $v$ is such that

dEv = + {(1 - QirtAr,)] 2 }-'^,) 

dO G’(?’)-' + [Qu’(r.,)r'p(r 2 ) + [<1 - Q)u'(r,)] * 'pfnf 

where $p(\cdot)$ is the measure of absolute risk aversion. Then to first order,
changes in $\theta$ leave the frequency of quits unchanged in the optimal
contract. If $dEv/d\theta$ is greater than this amount, quits increase with $\theta$; if
less, quits fall.

Although this expression looks menacing, it is possible to make
some headway in analyzing it. For example, suppose quits are
sufficiently rare that $Q$ is approximately zero. Then $dEv/d\theta = u'(r_2)$. If
this proportion is rewritten as $du/dEv = dr/d\theta$, it can be interpreted as
follows: In a situation in which quits are rare, if the change in average
outside production increases utility more than the change in inside
productivity would increase profits, quits should increase.

More generally, expression (20) is less than a weighted average of
marginal utilities of income in the contract. Thus if $dEv/d\theta$ is greater
than any observed marginal utility of income, quits increase with increases
in $\theta$. For constant absolute risk aversion utility functions, we
can also find a lower bound to expression (20). Let $U$ be the lowest
observed marginal utility; then

dEv < 4p C(v)U 

dQ k + 4pG'(f') 

implies that quits decrease with increases in $\theta$.

VII. Conclusions

We have contrasted optimal employment contracts in a world of full
information and in a world in which the worker’s alternative employment
possibilities cannot be observed by the firm. We have seen that
when contracts are established to protect a worker from income risk,
there is a divergence between the situations in which a worker wishes
to quit and the situations in which it is in the firm's interest for him to
quit: in general, the worker is willing to quit for “too low” an offer
elsewhere.

We have examined second-best contracts, with severance pay (or
penalties) used as an incentive device to keep employees attached to
the firm. We have concluded that in the second-best contract, the
difference between income and severance pay (that is, the opportunity
cost of quitting) is higher in states in which the employee is more
productive. Thus the imperfection leads to imperfect insurance:
individuals’ wages are linked to future productivity even though they
would prefer a contract with a guaranteed wage. Examples were used
to demonstrate that second-best contracts can lead either to too few or
to too frequent quits relative to full information contracts.

9 The Appendix in fact calculates the effects of a shift in the uniform segment of more
general distributions.

We thus have an explanation for the apparent lack of long-term
income insurance for employees without resort to anti-indenture
laws. We also have potentially testable implications: since the imperfection
in the insurance is ultimately due to the efficiency of having
individuals move to new jobs in certain circumstances, this lifetime
insurance will be more nearly perfect in those firms where specific
human capital guarantees that lifetime tenure is efficient.

The model I have examined is incomplete in numerous respects. 
The time dimension used is extremely crude: variations in hours 
worked and contracts covering more than 2 periods have not been 
considered. 

The uncertainties considered in this paper may best be regarded as
the uncertainty an individual faces about his own future opportunities.
I have treated this uncertainty as ultimately due to realized differences
in individual ability. Under these circumstances, it is reasonable
to assume that the firm is risk neutral, since this is the type of
uncertainty that can reasonably be pooled across new hires in a large
firm. It also seems reasonable to start by treating this random productivity
as exogenous, as I have done, and to assume that eventually
both firm and employee learn about the employee’s aptitude for work
within the firm. However, more general forms of uncertainty could
be usefully incorporated, including information private to the firm or
private to employees at the outset.

My approach has examined uncertainty only on the side of labor
supply. Although over the lifetime of the employee this may be the
main consideration, in the short run uncertainty on the labor demand
side may be of greater significance: in particular, an individual's
productivity will vary with the business cycle. Once we allow risks
that cannot be diversified by the individual firm, a host of new
problems arise: in particular, we will have to examine the problem
explicitly in an equilibrium framework with productivities
endogenous.$^{10}$ It remains a challenge to incorporate satisfactorily
the effects of contracts with differential information into such an
economy.

10 An initial examination of this particular equilibrium formulation
is made in Kahn (1984). For studies that examine related equilibrium
problems, see Polemarchakis and Weiss (1978), Geanakoplos and Ito
(1982), Grossman, Hart, and Maskin (1983), and Holmstrom (1983).


Appendix 


If $G(\cdot)$ is a locally uniform distribution at $v$, then it may be
rewritten as

\[ G(\cdot) = (1 - \alpha)\,\eta(\cdot) + \alpha\,\gamma(\cdot), \]

where $\gamma$ is a uniform distribution on an interval
$[m - \varepsilon, m]$ containing $v$, and the distribution $\eta$
equals some constant $k$ on the same interval. Then, letting
$\delta = \alpha\varepsilon^{-1}$,

\[ Q = 1 - G(v) = (1 - \alpha)(1 - k) + \delta(m - v). \tag{A1} \]
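Equation (A1) can be checked numerically. The sketch below is only an illustration: the function names and the parameter values are hypothetical, and it assumes, as in the text, that the uniform component has support $[m - \varepsilon, m]$ and that $\eta$ equals the constant $k$ on that segment.

```python
def survival_from_mixture(v, m, eps, alpha, k):
    """Q = 1 - G(v), computed directly from G = (1 - alpha)*eta + alpha*gamma."""
    if not (m - eps <= v <= m):
        raise ValueError("v must lie in the uniform segment [m - eps, m]")
    gamma_cdf = (v - (m - eps)) / eps   # uniform CDF on [m - eps, m]
    eta_cdf = k                         # eta is flat (equal to k) on the segment
    return 1.0 - ((1.0 - alpha) * eta_cdf + alpha * gamma_cdf)


def survival_from_a1(v, m, eps, alpha, k):
    """Q from the closed form (A1), with delta = alpha / eps."""
    delta = alpha / eps
    return (1.0 - alpha) * (1.0 - k) + delta * (m - v)


# Hypothetical parameter values, chosen only for the check.
q_direct = survival_from_mixture(v=0.7, m=1.0, eps=0.5, alpha=0.4, k=0.3)
q_formula = survival_from_a1(v=0.7, m=1.0, eps=0.5, alpha=0.4, k=0.3)
```

Both routes give the same $Q$, which is what (A1) asserts for any $v$ inside the uniform segment.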


Let $r(u)$ be the inverse of the utility function. Then problem (11)
can be rewritten as

\[
\max_{Q,\,u_1,\,u_2} \Big\{ (1 - Q)[\theta - r(u_1) + \kappa u_1]
  + Q[-r(u_2) + \kappa u_2]
  + \kappa(1 - \alpha)\int v\,d\eta(v)
  + \tfrac{1}{2}\kappa\delta(m^2 - v^2) \Big\}
\]

subject to (A1) and to

\[ v = u_1 - u_2. \tag{A2} \]


Eliminating (A2) and letting $\rho$ be the multiplier associated with
(A1), the first-order conditions become:

\[ -[\theta - r(u_1) + \kappa u_1] + [-r(u_2) + \kappa u_2] + \rho = 0, \tag{A3} \]

\[ (1 - Q)(-r_1' + \kappa) - \kappa\delta(u_1 - u_2) + \delta\rho = 0, \tag{A4} \]

\[ Q(-r_2' + \kappa) + \kappa\delta(u_1 - u_2) - \delta\rho = 0. \tag{A5} \]


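For various values of $\kappa$ it is possible to have interior or
boundary maxima to this problem. Assume that we are at an interior
maximum. Totally differentiating the system (A1), (A3)-(A5) and
regarding endogenous variables as functions of $m$ and $\theta$ yields
the following:

\[
\begin{bmatrix}
0 & 1 & \delta & -\delta \\
1 & 0 & r_1' - \kappa & -r_2' + \kappa \\
\delta & r_1' - \kappa & -(1 - Q)r_1'' - \kappa\delta & \kappa\delta \\
-\delta & -r_2' + \kappa & \kappa\delta & -Q r_2'' - \kappa\delta
\end{bmatrix}
\begin{bmatrix} d\rho \\ dQ \\ du_1 \\ du_2 \end{bmatrix}
=
\begin{bmatrix} \delta\,dm \\ d\theta \\ 0 \\ 0 \end{bmatrix}
\]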


Call the matrix on the left $H$ and assume that it is nonsingular.
Because of the second-order conditions for a maximum, $|H|$ is
nonpositive. Thus

\[ \frac{dQ}{dm} = -|H|^{-1}\delta\,[\,Q r_2''(1 - Q) r_1'' + \delta r_1' Q r_2'' + \delta r_2'(1 - Q) r_1''\,] > 0; \]

\[ \frac{du_1}{dm} = |H|^{-1}\delta\,[\,-Q r_2''(r_1' - \kappa) - \delta r_2'(r_1' - r_2')\,] > 0; \]

\[ \frac{du_2}{dm} = |H|^{-1}\delta\,[\,-\delta r_1'(r_1' - r_2') - (1 - Q) r_1''(-r_2' + \kappa)\,] \ge 0; \]

\[ \frac{dv}{dm} = \frac{du_1}{dm} - \frac{du_2}{dm} \quad \text{is of ambiguous sign;} \]

\[ \frac{dQ}{d\theta} = |H|^{-1}\delta^2\,[\,Q r_2'' + (1 - Q) r_1''\,] < 0. \]

These inequalities prove theorem 12 in the text. 

A second way of interpreting movements in $m$ is as movements in $Ev$.
We know that $dEv/dm = \alpha$: in words, a shift of the uniform
segment to the right by one unit increases the expectation of $v$ by an
amount equal to the weight of the uniform distribution in the whole of
$G(\cdot)$ (e.g., if the whole distribution is uniform, $\alpha$ is 1).
Thus if $Ev$ and $\theta$ are to be increased in such a way that $Q$
remains constant, then

\[
\frac{dEv}{d\theta}
= -\frac{dQ/d\theta}{dQ/dm}\,\frac{dEv}{dm}
= \frac{(1 - Q) r_1'' + Q r_2''}
       {\delta^{-1}(1 - Q)Q\,r_1'' r_2'' + r_2' r_1''(1 - Q) + r_1' r_2'' Q}\,
  \frac{dEv}{dm},
\]

which, when $\alpha$ equals one, is equivalent to formula (20) in the
text. The subsequent formulas of Section VI may be derived by
substituting the first-order conditions back into this expression.


Testing the Efficiency of Extraction
from a Stock Resource


Scott Farrow

Carnegie-Mellon University


The theoretical conditions for efficient extraction from a known
stock resource by a competitive mining firm are reviewed. These
conditions are tested using proprietary data from a mining firm.
Output price data and coefficient estimates from a translog cost
system are used to compute the in situ value of the resource and the
stock effect. Changes in the in situ value over time are then
statistically compared with the expected price path. The results
reject the hypothesis that the data are consistent with the
theoretical model and the maintained hypotheses. Variations of the
basic model that incorporate a time-varying discount rate, an
alternative expected price series, and a constraint on the rate of
output are also tested and rejected.


I. Introduction 

The purpose of this paper is to test for consistency between the
theoretical, privately efficient extraction path and the actual
extraction path of an individual mining firm. The economic issue is
whether the basic economic model for addressing this problem describes
the actions of the firm. This issue is of interest to economists and
policy analysts who seek an accurate description of how extraction
decisions are made. If an accurate description of actual behavior
exists, then a rational basis exists for comparing current resource
policy with alternative policies. If the economic model is not
consistent with the data, then research is indicated to construct a
model that does describe the actual behavior of a firm facing this
problem. It is worthwhile to emphasize that the empirical model tested
here is a joint test of efficiency and of the maintained hypotheses
used to construct the empirical model of the firm.

I wish to thank Gregory M. Duncan, Walter Butcher, Frederick Inaba,
Jeffrey Krautkraemer, Steve Garber, and V. Kerry Smith for insightful
comments on this paper. Empirical detail was possible only through the
generosity of a large mining firm that has chosen to remain anonymous.
Support was provided by Carnegie-Mellon University, the Resources for
the Future Small Grants Program, and Washington State University.

It is also possible to interpret the economic model of extraction as a
description of how the extraction decision should be made. For
instance, divergence from the model may lead the firm to alter the
extraction decision as soon as it realizes its “error.” Also, policy
analysts or advocates may recommend taxes, subsidies, or regulations to
force firms to behave as the model specifies.

The basic economic model of the mining problem assumes that the
objective of the firm is to maximize the present value of profits from
extracting a known stock of ore. Hotelling (1931) showed that
efficiency requires the rate of output to be chosen such that the
marginal net benefit of extracting ore today is equated with the
discounted marginal net benefit of extracting it in the next time
period, the opportunity cost of current extraction. Hotelling was also
able to show that in the absence of externalities the maximum
discounted total surplus could be achieved by a perfectly competitive
mining industry.

Researchers since Hotelling’s day have added several theoretical 
complications, but it remains an open question whether the Hotelling 
model is an empirically valid description of the extraction path of a 
mining firm. This issue is not settled because data have not been 
available to researchers at the analytical level that Hotelling modeled. 
A unique part of this paper is a data set supplied by a mining firm. 
The detail of this data set allows rigorous tests of the empirical validity 
of the theoretical model. 

The data are supplied by a mining firm in an underground, hard
rock mining industry. Because the company has asked to remain
anonymous, many details cannot be made explicit. However, the general
characteristics of the mine are that several metals are extracted
from ore that is mined from a relatively deep source. The output and
input markets are competitive. The data series is monthly for the time
period from January 1975 to December 1981 inclusive. The data set
includes information on input and output prices, the rate of output,
depth of production, and other relevant data. During the sample
period there were no technological changes in the industry or
particularly large discoveries of new deposits that would have altered
the characteristics of the market. Further detail regarding the data is
presented in Appendix A and discussed in the text.

The Hotelling model is tested for consistency with the data in the
following way. First, a standard version of the Hotelling model that
includes the effect of both current and cumulative output on costs is
reviewed to derive necessary conditions for present-value profit
maximization. This model and previous tests of the Hotelling model are
described in Section II. Second, the conditions for optimization are
restated in an empirically testable form. Three empirical variations of
the basic model that incorporate a time-varying discount rate, expected
prices, and constrained output are also developed. The last of these
models was developed because of unusual empirical results using the
basic model and discussions of rated capacity in the mining
engineering literature. These models are presented in Section III.
Transformations of the original data that are necessary to estimate and
test the model are discussed in Section IV. The results of testing the
Hotelling model are presented in Section V. The joint set of the
hypotheses implied by the Hotelling model and the maintained hypotheses
are refuted by these results. The estimate of a coefficient that is
expected to reveal the firm's discount rate is significantly negative
in each of the versions of the test. The results are, however,
informally consistent with the rule of thumb employed by mining firms
to determine the lowest quality of ore that is extracted. In times of
rising prices the firms lower the cutoff grade, the lowest quality of
ore mined. The firm does not necessarily increase the quantity of ore
processed sufficiently to cause an increase in pure metal output.
Empirically, an observer would note that the extraction of the marginal
unit of pure metal is delayed from the current time period to a
lower-priced time period when it is finally extracted. This behavior
may be the cause of a negative value for the coefficient that is
expected to be equal to the firm's discount rate. However, there is no
formal model linking the complications introduced by mining practice to
the empirical estimates obtained to test the Hotelling model. Further
directions for research are indicated in the final section. Last,
descriptions of the data and the detailed regression results are
presented in the appendices.

II. The Theory of Resource Extraction 

In this section the basic model for the optimal rate of extraction of a 
nonrenewable resource is developed for a firm extracting from a 
known resource stock. Empirical studies relating to this theory are 
also discussed. 

A. The Hotelling Model 

The basic efficiency conditions result from maximizing the present
value of profits when the stock of ore is known and the firm is a price
taker in both the input and the output markets. Extraction costs are
modeled as a function of both the rate of output and cumulative
extraction. Numerous statements of this problem have appeared in
the literature since Hotelling's seminal article in 1931. Recent
rigorous statements include those by Weinstein and Zeckhauser (1975),
Dasgupta and Heal (1979), and Fisher (1981).

The present-value profit function of the firm is

\[ \int_0^T e^{-\delta t}\{P(t)R(t) - C[R(t), X(t), W(t)]\}\,dt, \]

where

$P(t)$ = exogenous resource commodity price,

$\delta$ = firm's rate of discount,

$R(t)$ = rate of extraction,

$X(t)$ = stock remaining in the mine,

$W(t)$ = vector of exogenous input prices, and

$T$ = terminal time period.

The control variable over which a competitive firm maximizes
discounted profits is the rate of extraction, $R(t)$. Because of
competition in all markets, the output price, $P(t)$, and the vector of
input prices, $W(t)$, are not a function of the firm's output. All
prices are assumed to be known to the firm for the entire planning
period so that extraction can be completely planned at time 0.

Modeling the cost function involves a change from static economic
modeling. The rate of output of the firm enters the cost function in
the usual way. However, costs are also modeled to be a function of
cumulative extraction. This effect is modeled in the theoretical
literature by including the stock remaining in the mine as an argument
in the cost function. There are two distinct justifications for this
factor. One justification is that cumulative extraction results in
mining lower-quality ores (Herfindahl and Kneese 1974). Mining
lower-grade ores drives up the costs of producing a unit of refined
output as more ore must be processed. Empirical studies that interpret
the stock effect in this manner include those by Zimmerman (1977) and
Slade (1980).

A second justification of the stock effect is that ore from a
particular mine is extracted from greater depth or greater horizontal
distances from the shaft as cumulative extraction increases. This
interpretation of the stock effect has been mentioned in the
theoretical literature as far back as the writings of John Stuart Mill
and has been explicitly recognized by Hotelling (1931), Barnett and
Morse (1962), and Fisher (1979).

The appropriate model of the stock effect varies from mine to
mine. The geology of some mines is such that the vein increases in
quality at a greater depth; others exhibit the opposite pattern. In the
mine being studied there is no obvious progression of declining
grades during the sample period. Therefore the stock effect is
assumed to be due to mining a homogeneous block of ore at a greater
depth.

The necessary conditions for efficiency are obtained by solving the
formal discounted profit-maximizing problem of the competitive
mining firm using the method of optimal control. The complete
statement of the problem includes a constraint on the resource stock
available at any time $t$ and is stated as:

\[
\begin{aligned}
\max\ & \int_0^T e^{-\delta t}\{P(t)R(t) - C[R(t), X(t), W(t)]\}\,dt \\
\text{s.t.}\ & X(t) = B - \int_0^t R(s)\,ds, \\
& R(t),\ X(t),\ P(t),\ W(t) \ge 0, \\
& T\ \text{unconstrained};
\end{aligned}
\]

where $B$ = known resource stock and other variables are as defined
previously. Both the number of mining locations (faces) and the output
per location in a mine can be altered. Therefore the function is
assumed to be differentiable. Two properties of the cost function,
positive marginal cost of output and a negative marginal cost with
respect to stock in the ground, will be used in the empirical
investigation. The latter marginal cost assumes that a marginal unit of
ore can be extracted from a shallower depth next period than if the
current unit had already been extracted. This term represents the stock
effect.

The necessary conditions for optimization are$^1$

\[ m(t) = P(t) - C_R(t), \tag{1} \]

\[ \dot m(t) = \delta\,m(t) + C_X(t), \tag{2} \]

\[ X(T)\,m(T) = 0, \qquad m(t) \ge 0, \tag{3} \]

\[ \{P(T)R(T) - C[R(T), X(T), W(T)]\} - m(T)R(T) = 0, \tag{4} \]

and

\[ R(t),\ X(t),\ P(t),\ W(t) \ge 0. \]

The original constraint, $\dot X(t) = -R(t)$, is also a necessary
condition. Letter subscripts on functions indicate partial derivatives.

In the case to be studied, the inequality constraints in the last line
are met by inspection of the data. Equations (3) and (4), the
transversality conditions, apply to the final time period. The terminal
time period is not observed in the data set. Therefore equations (3)
and (4) do not form an explicit part of the test for consistency
between the model and the data. However, these equations are integral
components of the theoretical and empirical problem, as the following
discussion of equations (1) and (2) makes clear.

1 Sufficient conditions for optimization require joint concavity of
the integrand in R and X. The test for consistency between the model
and the data focuses on the necessary conditions. Therefore the
second-order properties are not discussed below.

Equations (1) and (2) can be interpreted as static and dynamic
conditions for optimization. These equations form the basis of the test
for consistency between the model and the data. Equation (1), the
static efficiency condition, states that at any time $t$ there exists a
wedge, $m(t)$, between the commodity price and the marginal cost of
extraction due to the scarcity of the resource. This value, also called
the scarcity rent, reflects the basic postulate that any scarce good
commands a positive price. In addition, the variable $m(t)$ has the
usual interpretation of a shadow price; it is the change in the
objective function from a unit increment in the constraint. Thus it is
the marginal value of a unit of the resource left in the mine. The
larger the supply of the resource relative to demand over the entire
time horizon, the smaller $m(t)$ will be and the less scarce the
resource is. When $m(t)$ is in some sense small in the original time
period, then it will remain small until sufficient time has elapsed
that exhaustion occurs in the near future.$^2$ The variable $m(t)$ is
not directly observed in the data set. However, estimates of $P(t)$ and
$C_R(t)$ can be computed. The static condition for efficiency
represented by equation (1) is then used as an identity to compute
$m(t)$.
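As a minimal sketch of how equation (1) serves as an identity, the scarcity rent can be backed out period by period from a price series and an estimated marginal-cost series. The function name and the figures below are hypothetical and are not taken from the firm's data.

```python
def scarcity_rent(prices, marginal_costs):
    """Apply equation (1) as an identity: m(t) = P(t) - C_R(t) for each period."""
    if len(prices) != len(marginal_costs):
        raise ValueError("price and marginal-cost series must have equal length")
    return [p - c for p, c in zip(prices, marginal_costs)]


# Hypothetical monthly figures (dollars per unit of refined metal).
prices = [100.0, 104.0, 103.0]
marginal_costs = [82.0, 83.5, 85.0]
rents = scarcity_rent(prices, marginal_costs)
```

Because the rent is computed rather than observed, any error in the estimated marginal cost carries over one for one into $m(t)$.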

Equation (2) is the dynamic efficiency equation and the focus of the
test of the intertemporal efficiency of a firm. This equation states
that the change in the scarcity rent is equal to the sum of two
factors. The first factor, $\delta m(t)$, reflects the external
opportunity cost of holding a unit of the resource in the ground. It is
the value of the forgone interest that could have been earned by
extracting the unit last period and investing the return in an
alternative investment. The second term, $C_X$, represents the stock
effect that is the internal opportunity cost of extraction. Noting that
$C_X < 0$, this term indicates that the reduction in future costs from
leaving the resource in the ground partially offsets the external
opportunity cost. Therefore, scarcity rents will grow at a rate slower
than the rate of interest or may even decay over some time period if
the stock effect is sufficiently large. The extent of decay that is
possible is limited by the transversality conditions. If $m(t)$ decayed
to the point where it equaled zero, extraction would stop (see eq.
[3]). This case would indicate that decreased accessibility had caused
the in situ resource to be valueless.$^3$

2 In the theoretical example developed by Dasgupta and Heal (1979,
p. 172), m(t) will be close to zero until approximately 40 years before
exhaustion. The metals studied in this case satisfy that condition
based on dividing known reserves by annual consumption (Meadows et al.
1972). This does not imply agreement with the hypothesis that the
metals being studied will be exhausted within 40 years. The “years
before exhaustion” is used to establish the materials that are most
likely to have a value of m(t) that is statistically observable.
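The behavior of the scarcity rent under equation (2) can be illustrated with a small simulation. This is a sketch under stated assumptions: it uses a one-period discrete approximation, m(t) = (1 + δ)m(t-1) + C_X, holds the stock effect C_X constant, and uses hypothetical values for the discount rate and the initial rent.

```python
def rent_path(m0, delta, stock_effect, periods):
    """Iterate m(t) = (1 + delta) * m(t-1) + C_X with a constant C_X <= 0."""
    path = [m0]
    for _ in range(periods):
        nxt = (1.0 + delta) * path[-1] + stock_effect
        if nxt <= 0.0:
            path.append(0.0)   # rent has decayed to zero; extraction would stop
            break
        path.append(nxt)
    return path


def first_period_growth(path):
    """Growth rate of the rent between the first two observations."""
    return (path[1] - path[0]) / path[0]


delta = 0.0125                                        # hypothetical monthly rate
no_stock_effect = rent_path(10.0, delta, 0.0, 12)     # pure Hotelling case
small_stock_effect = rent_path(10.0, delta, -0.05, 12)
large_stock_effect = rent_path(10.0, delta, -0.50, 12)
```

With no stock effect the rent grows at exactly the discount rate; a small negative stock effect slows the growth below that rate, and a sufficiently large one makes the rent decline, as described in the text.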

B. Previous Tests of Dynamic Efficiency 

Previous tests of dynamic efficiency by a resource-extracting industry
focus on the relationship in equation (2), the dynamic efficiency
condition. A direct test of the dynamic efficiency condition is one
that obtains a value for the scarcity rent and determines the
quantitative relationship between its price movement, the interest
rate, and a stock effect. The barrier to such a test is the lack of
data. Scarcity rents are not observable, and cost information is
proprietary information of the firm. Royalty contracts, which exist
when one company owns the resource and another firm is the operator,
could provide some indication, though these data are also not readily
available.

Because of these data problems, there have been few tests of dynamic
efficiency. The general approach has been to use data on a refined
commodity, pure copper for instance, because these industry price data
are readily available. The test then seeks a statistical relationship
between the change in the commodity price and the rate of interest. If
extraction is costly, $C_R > 0$, and the stock effect is negligible, a
given percentage change in the in situ price will cause a less than
proportional increase in the commodity price.
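A short numerical illustration of the last point follows. It assumes that the commodity price is the sum of the in situ price and marginal extraction cost, as in equation (1), and that marginal cost is unchanged between the two periods; the function name and the figures are hypothetical.

```python
def commodity_price_growth(in_situ_price, marginal_cost, in_situ_growth):
    """Growth of P = m + C_R when m grows and C_R is held fixed."""
    price_now = in_situ_price + marginal_cost
    price_next = in_situ_price * (1.0 + in_situ_growth) + marginal_cost
    return (price_next - price_now) / price_now


# Rent of 20 and marginal cost of 80: a 10 percent rise in the in situ
# price moves the commodity price by only about 2 percent.
growth = commodity_price_growth(20.0, 80.0, 0.10)
```

The commodity price grows at the in situ rate scaled by the share of rent in price, so a test based on commodity prices understates the growth of the scarcity rent whenever extraction is costly.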

The earliest test of the possible implications of the dynamic
necessary condition is credited to Barnett and Morse (1962). Their
purpose was to test the hypothesis of increasing economic scarcity of
natural resources. They argued that this could be revealed in two ways.
One indication of increasing economic scarcity would be real resource
prices that increase over time. If the optimization problem accurately
represents the behavior of firms, then the real price of minerals
would be increasing if the stock effect is smaller in absolute value
than the increase in the scarcity rent, ceteris paribus. Barnett and
Morse rejected the hypothesis of an upward trend to real prices.
Barnett (1979, p. 164) concluded that “the principal reasons for this
[denying the doctrine of increasing economic scarcity] were: (1)
substitutions of economically more plentiful resources for less
plentiful ones; (2) increased discoveries and availability of domestic
mineral resources; (3) increased imports of selected metallic minerals;
and (4) a marked increase in the acquisition of knowledge and
sociotechnical improvements.”

3 The stock effect may be interpreted from an equivalent but slightly
different perspective. If one returns to an analysis of the model in
present-value terms, instead of current value, recall that in the
absence of a stock effect the present value of the in situ resource
is the same for all periods. When the stock effect is included it can
be shown (Solow and Wan 1976) that the present value of the in situ
resource changes. The total change in this value from time zero to the
terminal time period is equivalent to the discounted cumulative value
of the stock effect. The change in current-value terms of the resource
thus includes the current term of the cumulated stock effect.

However, Barnett and Morse’s rejection of increasing economic 
scarcity does not reject the hypothesis that mining firms behave 
efficiently, because the very factors that led to rejecting economic 
scarcity must be controlled for in order to test efficiency. 

Barnett and Morse also tested for increasing economic scarcity by
measuring whether the real unit cost of supplying natural resources
had been increasing over time as measured by a unit index of factor
productivity (Smith 1980). One interpretation of this measure is that
it is testing for an industrywide stock effect though allowing
technology to change. The results contradicted the hypothesis of
increasing real unit costs, even on a more disaggregated basis than was
applied to the first test. Barnett and Morse concluded that
technological change and other factors that decreased inputs must have
dominated any output or stock effects during the time period. This
result provided further evidence for refuting the hypothesis of
increasing economic scarcity. However, since Barnett and Morse analyzed
a slightly different question, their study is unable to provide a
rigorous test of the consistency of the Hotelling model with the
behavior of the firm.

More recent investigations have sought to test the dynamic behavior
of the mineral industry using a model of arbitrage behavior for
owners of capital assets. Heal and Barrow (1980) attempt to test a
model in which market interest rates reflect the alternative earnings
that could be earned by a commodity trader. Using a time-series
model of lags of interest rates and income, they reject the hypothesis
that the level of the interest rates is a significant explanatory
factor in their model of mineral commodity prices.

However, Heal and Barrow found that the change in an interest
rate, as opposed to its level, is a significant variable.$^4$ It is
clear that the authors are uneasy with their conclusion, as they state
that “the most unsatisfactory part of our model is the implication that
if interest rates are constant, then the rate of change of resource
prices must be zero, a conclusion which contradicts the simple asset
market equilibrium arguments” (Heal and Barrow 1980, p. 175).

4 If scarcity rent is in fact statistically negligible, this result
is reasonable since the firm still incurs standard capital costs whose
price is the rate of interest (Slade 1980).

Smith (1979) also studied several mineral commodities and fuels
using Heal and Barrow's model. The results were compared with a
simple Hotelling model$^5$ and a time-series model. He too warned of
the dangers of using commodity level prices (with reference to eq.
[1]): “If we do not observe the marginal extraction costs, $C_R$, it
becomes difficult to predict a priori how closely the movements in
prices will be to the interest rate” (Smith 1979, pp. 1-5). Smith
argued that Heal and Barrow's model is properly interpreted as a
reduced-form model that is compatible with several different structural
models of resource extraction. This reflects the several alternative
models enumerated by Heal and Barrow. Smith's results, like those of
Heal and Barrow, provide only limited support for the hypothesis that
resource markets follow arbitrage behavior.

Recent theoretical results provide more substantive arguments why
the commodity price path may differ significantly from the in situ
price path. Levhari and Pindyck (1981) analyze the case of a durable
exhaustible resource. Because the commodity is durable, the stock of
the commodity that is available accumulates over time. This
accumulation lowers the commodity price path from the path implied for
a nondurable exhaustible resource. However, the optimum price path
of the in situ resource is defined by the Hotelling formulation.

Pindyck (1982) considers a second case of jointly produced
resources. The model concludes that the jointness of supply may lead to
a different price path for the commodity than is expected for the in
situ resource. However, the Hotelling rule still applies to the price
path of the in situ resource.

A final theoretical approach to the determination of scarcity rents
has been developed by Pindyck (1978) and Devarajan and Fisher
(1982). These authors include exploration activity in the model of the
firm. Under certain conditions, they show that scarcity rent is
equivalent to marginal discovery costs. The equivalence breaks down in
both models, in a world without uncertainty, if there exists a stock
effect in the exploration process. The stock effect of exploration
occurs if cumulative exploration increases the marginal costs of
discovery. In a model incorporating uncertainty, Devarajan and Fisher
show that marginal discovery costs can provide an upper bound on the
value of scarcity rent. This property results for a risk-neutral firm
if $C_{XX} < 0$. The empirical determination of $C_{XX}$ can thus
indicate the fruitfulness of alternative methods of determining
scarcity rent. The ambiguity of empirical results using commodity
prices and the increasing theoretical distinction between resource
commodities and in situ resources underscore the usefulness of
analyzing in situ prices.

5 $C_R = C_X = 0$.

The research most closely linked to a direct test of the Hotelling
model is by Stollery (1983), who attempts to analyze the movement of
the in situ price for a mining firm that is a price leader in its
industry. After making suitable modifications for his case study, he
does not reject the hypothesis of efficient behavior based on the
estimated value of $\delta$ for his data. His procedures and data are a
substantial step forward but are not appropriate for the industry at
hand. For instance, Stollery explains that 19 price changes occurred in
the industry in 28 years. This price behavior reflects a substantially
different structure from the competitive market analyzed in this paper.
The imperfectly competitive structure of Stollery's case complicates
the interpretation of any variable computed to equal the difference
between marginal revenue and marginal cost. The difference may be
called the in situ value; it may also represent the profit margin that
may change due to some dynamic limit pricing policy. Finally, Stollery
accepts, after careful testing, a production function that is
incompatible with the data in the current case. These differences in
market structure, production technology, and data indicate that further
research is necessary to understand fully the actions of a mining firm.

III. Three Empirical Models 
of Dynamic Efficiency 

A. The Basic Model 

The empirical test of efficiency rests on consistency between the data
and the necessary conditions set out in equations (1) and (2).
However, $m(t)$, the shadow price of the resource in the ground, is
unobservable. Consequently the static condition is imposed to identify
$m(t)$. This biases the test toward a finding of efficient behavior as
both conditions are necessary for efficiency. A second problem is
caused by the discrete-time data provided by the firm. The
continuous-time specification of equation (2) must be changed to
correspond to the discrete data. It can be shown (Fisher 1981) that the
continuous-time model developed here is the limiting case of a
discrete-time model. In particular, the dynamic efficiency condition in
discrete time takes the following form:

\[ \Delta m(t) = \delta\,m(t - 1) + C_X(t). \tag{2$'$} \]

Equation (2') is the empirical form that will be used to test dynamic
efficiency.

The test for efficiency then rests on a statistical test of the
following parameters of the dynamic efficiency condition: (1) $\delta$
= the firm's rate of discount for a unit time period; and (2)
$\beta_1 = 1$, where $\beta_1$ is the implicit coefficient on the stock
effect in equation (2'). If these hypotheses are not rejected then the
conclusion is that the Hotelling model is consistent with the data.
This model will also be referred to as the basic model.
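The logic of the test can be sketched as a regression of the change in the rent on the lagged rent and the stock effect, with the two coefficients compared with the discount rate and with one. The code below is an illustration only, not the estimation procedure used in the paper: it fits the two coefficients by ordinary least squares without an intercept, uses synthetic noise-free data generated from equation (2') itself, and omits standard errors. All names and values are hypothetical.

```python
def ols_no_intercept(y, x1, x2):
    """Least-squares coefficients (b1, b2) for y = b1*x1 + b2*x2 + error."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("regressors are collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Synthetic data: 84 months, matching the length of a January 1975 to
# December 1981 sample; the values themselves are invented.
TRUE_DELTA = 0.0125                                          # monthly discount rate
stock_effect = [-0.05 - 0.01 * (t % 5) for t in range(84)]   # C_X(t) < 0
rent = [10.0]
for t in range(1, 84):
    rent.append((1.0 + TRUE_DELTA) * rent[t - 1] + stock_effect[t])

delta_m = [rent[t] - rent[t - 1] for t in range(1, 84)]      # left-hand side of (2')
lagged_rent = rent[:-1]                                      # m(t - 1)
c_x = stock_effect[1:]                                       # C_X(t)

delta_hat, beta1_hat = ols_no_intercept(delta_m, lagged_rent, c_x)
```

On data that satisfy (2') exactly, the fitted coefficients recover the discount rate and a unit coefficient on the stock effect; with real data the hypotheses would be judged against the sampling variability of the estimates.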

Determining the rate of discount is often a source of contention
among economists. In this case, however, the nominal ex ante discount
rate that the firm used to discount after-tax cash flows during
the sample period was between 15 and 20 percent on an annual basis.
As the data used in this study are monthly, the monthly rate would be
in the range of 1.2-1.5 percent. However, as the estimate of $\delta$
is based on realizations of the firm's actions, that is, ex post, the
estimated value may differ from this range and still be regarded as
acceptable depending on each analyst's point of view. Therefore the
hypothesis test for the value of $\delta$ will be presented using
confidence intervals instead of a point value of the null hypothesis.
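The monthly range can be reproduced from the annual range under the assumption of monthly compounding. The text does not state how the conversion was made, so the compounding convention below is an assumption, and the function name is hypothetical.

```python
def monthly_rate(annual_rate):
    """Monthly rate that compounds to the given annual rate over 12 months."""
    return (1.0 + annual_rate) ** (1.0 / 12.0) - 1.0


low = monthly_rate(0.15)    # roughly 1.17 percent per month
high = monthly_rate(0.20)   # roughly 1.53 percent per month
```

Rounded to one decimal place, these values agree with the 1.2-1.5 percent range quoted in the text.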

The assumption of a constant rate of discount may prove to be an 
inappropriate abstraction. Previous studies of the rate of change of 
commodity prices have typically assumed that the rate of discount 
varies with some appropriate rate of interest. The first variation of the 
basic model incorporates this concern by allowing the rate of discount 
to equal the actual borrowing rate of the firm. The only change that 
occurs in the necessary conditions is that $\delta$ is replaced by $\delta(t)$. 6 


Basic Conditions of the Industry 

Numerous taxes and subsidies are a part of the actual extraction 
problem of the firm. In addition to federal corporate income taxes, 
mining firms are often subject to severance taxes and additional state 
and local income taxes. Alternatively, a mining firm’s tax burden is 
eased by depletion allowances and current expensing of some capital 
items. 

In this study the federal income tax and other taxes levied on the 
firm are assumed to fall entirely on economic profits. The effect of 
the tax on equation (1) is to change the level of m(t) by a factor k = 
(1 - tax rate). The effect on the dynamic condition is to change both 
sides of the equation by the same factor, k. Estimating the efficiency 
conditions as defined in the basic model amounts to identifying the 
factor k and dividing through by it in order to identify $\delta$ and $\beta_1$. The 
profit-maximizing output is unaltered. Therefore tax rates are ex¬ 
cluded from the estimation procedure and all variables are computed 
net of taxes. 


6 It can be shown that this revised necessary condition is obtained by replacing the 
discount rate factor in time t, $-\delta t$, by $-\rho(t)$, where $\rho(t) = \int_0^t \delta(s)\,ds$. The constant rate of 
discount is a nested version of $\rho(t)$ when $\delta(s) = \delta$. 
Two new laws were passed during the sample period that could 
have affected an optimal extraction path. The first, the promulgation 
of the final EPA regulations on effluent from mining and milling and 
ambient lead content in the air, was issued in 1978. The second, the 
Comprehensive Environmental Response, Compensation, and Liabil¬ 
ity Act of 1980 (the Superfund), imposed fixed taxes per ton of 
effluents for a set period of time. 

The firm expected the direct cost of EPA regulations to be minimal, 
and so the regulation would not be expected to alter the extraction 
path. However, the Superfund tax, which under standard smelter 
contracts would be passed along in part to the mine operator, distorts 
the theoretically optimal extraction path as it is in constant nominal 
terms. This distortion is due to the tax’s having a smaller effect on 
future profits because of discounting. Mine operators would theoreti¬ 
cally shift production to the future when the discounted effect of the 
tax on profits is smaller. However, this tax is assumed to be an 
insignificant determinant of the extraction decision, because the tax 
was not imposed until late in the sample period and the firm was 
operating under a preexisting contract for its output. 

In addition to the institutional structure of the industry, the basic 
conditions for a mineral industry are partially defined by the many 
types of uncertainty affecting these markets. Economic literature on 
this subject includes treatments of uncertain demand (Weinstein and 
Zeckhauser 1975), uncertain resource stocks (Kemp 1976; Loury 
1978; Gilbert 1979), and the uncertain introduction of a backstop 
technology (Dasgupta and Heal 1979; Dasgupta and Stiglitz 1981). 
For the case at hand, uncertain demand could have resulted from 
speculative commodity purchases in the late 1970s or from the uncer¬ 
tain timing of EPA regulations on the use of lead as a gasoline addi¬ 
tive. In addition, the exact extent of a mineral stock is seldom known, 
and possible substitute commodities and processes are frequent 
sources of discussion in the Minerals Yearbook (U.S. Department of the 
Interior, Bureau of Mines 1975-81) or in the popular press. How¬ 
ever, little can be said about the impact of uncertainty on the extrac¬ 
tion path of the firm in the absence of a compelling reason to believe 
that one type of uncertainty dominated the others. This ambiguous 
result is due to extraction being shifted toward the present or the 
future depending on the cause of the uncertainty. The effect of uncertainty 
is often transformed into an increase or decrease in the rate 
of discount because this change speeds up or slows down the rate of 
extraction. Therefore the ambiguity due to uncertainty may increase 
or decrease $\delta$ from its value in a world of certainty. It is assumed in 
this study that any adjustment for uncertainty is already incorporated 
into the firm’s rate of discount that is provided as datum. 
B. The Expected-Price Model 

The theoretical model assumes that the output price is known for the 
entire planning period at the time the original extraction path is 
determined. However, complete futures markets do not exist for this 
product. The output price data available in the case study are the 
prices the mine operator received from the smelter for partially 
refined ore in the form of concentrates. 7 These data reflect the actual 
price that the mine operator receives in each time period. 

One method of estimating expected future price is to assume that 
the price that actually occurred in time t was the expected price when 
the extraction path was determined. This is the empirical approach 
taken in the test of the basic model; the actual price is used in the 
static condition to solve for m(t). 

An alternative formulation of price models the expected price as if 
the mine operator could exactly forecast a weighted average of the 
prices from the prices in the preceding 6 months. The effect of this 
formula is to reduce the month-to-month fluctuation in the expected- 
price series. Specifically, expected values are formed based on the 
following equation: 

$$P(t) = \alpha_0 + \sum_{i=1}^{6} \alpha_i P(t - i). \tag{5}$$

The $\alpha_i$ coefficients are obtained from an OLS regression. 8 
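
The expected-price construction in equation (5) can be sketched as an ordinary least-squares regression of the current price on an intercept and its six lags. The example below is a minimal illustration on a synthetic price series; the function name `expected_prices` and the data are assumptions, not the paper's code or data.

```python
import numpy as np

def expected_prices(prices, n_lags=6):
    """Regress P(t) on a constant and P(t-1), ..., P(t-n_lags); return (coef, fitted)."""
    p = np.asarray(prices, dtype=float)
    T = len(p)
    # Column i holds P(t-i) for every usable period t = n_lags, ..., T-1.
    lags = [p[n_lags - i : T - i] for i in range(1, n_lags + 1)]
    X = np.column_stack([np.ones(T - n_lags)] + lags)
    y = p[n_lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

# Synthetic monthly price series (hypothetical, for illustration only).
rng = np.random.default_rng(0)
prices = 50.0 + np.cumsum(rng.normal(0.0, 1.0, size=60))
coef, fitted = expected_prices(prices)
print(coef.shape, fitted.shape)
```

The fitted values form the smoothed expected-price series; the first six months are lost because their lags are not available.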

The results of estimating the basic model with expected prices 
formed in this fashion are listed in the later sections as the expected- 
price model. This model allows the extraction path based on a 
smoothed price series to differ from the path based on perfect knowl¬ 
edge of future prices. No claim is made that this specification of 
expected prices is an exact representation of the formulation of ex¬ 
pectations. The real expectations process remains unknown. The ex¬ 
pected-price model is properly interpreted as a test of the robustness 
of the estimated parameters when an important variable in the deci¬ 
sion process, the output price, differs from its values in the basic 
model. 


C. The Constrained Model 

It may be that the firm is actually constrained from maximizing profit 
as the basic model assumes. Instead, in each period of time, production 
may be constrained to be less than or equal to the maximum 
capacity of a fixed factor, say the hoisting facility. This viewpoint, 
while not unanimously held at the company headquarters, reflects 
discussions in the mining engineering literature on the rated capacity 
of a mining operation. It is reasonable to assume that the average cost 
of a hoisting facility will be close to the minimum value when operated 
at its rated capacity. As a result observed operations might yield a 
censored sample where no observations are recorded beyond the 
rated capacity of the mine. 

7 The process of concentration results in the mine operator's shipping a product that 
has a higher proportion of metal than the raw ore. The firm receives a proportion of 
the market price for the pure commodity, subject to cost deductions by the smelter. See 
Kilgore et al. (1983) for the conditions of a typical smelter contract. 

8 The detailed results are presented in App. B. 

If there is a capacity constraint then the theoretical and economet¬ 
ric models must be respecified to include this factor. One may do this 
in the theoretical model by assuming that there is an upper bound on 
the value of the control variable, the rate of pure metal output. It can 
be shown (Kamien and Schwartz 1981) that with a constrained rate of 
output the static efficiency condition must be rewritten. Constrained 
optimization of the Hamiltonian yields: 

$$m(t) = P(t) - C_R(t) - \lambda(t). \tag{1'}$$

The multiplier $\lambda(t)$ is the shadow price of the capacity constraint. Its 
value is zero when capacity is not binding and greater than zero when 
it is binding. The dynamic condition for efficiency remains the same 
as equation (2). 

If the constrained model is the appropriate statement of the prob¬ 
lem then the basic and the expected-price models mismeasure m(t) in 
the 26 observations when the constraint is binding. 9 The model to be 
developed in the following paragraphs explicitly incorporates this 
mismeasurement into the equation that is estimated. 

The relation between equation (1) and equation (1') can be shown 
by redefining the variable used for m(t) in the previous models as 

$$A(t) = P(t) - C_R(t) = m(t) + \lambda(t)\,D1(t), \tag{6}$$

$$D1(t) = 1 \text{ if } R^* \ge R^{\max}; \; 0 \text{ otherwise}.$$

The theoretical value of $\lambda(t)$ is now split into two components, $\lambda(t)$ and 
$D1(t)$. The dummy variable $D1(t)$ represents the decision of the firm 
whether or not to produce at the output where the constraint is binding. 
The continuous variable $\lambda(t)$ represents the shadow price if the 
constraint were binding in time period t. 

Manipulation of this equation yields an empirical model that provides 
a test of constrained dynamic efficiency. The first difference of 
equation (6) is $\Delta A(t) = \Delta m(t) + \lambda(t)D1(t) - \lambda(t-1)D1(t-1)$. 
Substituting equation (2') for the first term on the right-hand side yields: 

$$\Delta A(t) = \delta\, m(t-1) + \beta_1 C_X(t) + \lambda(t)D1(t) - \lambda(t-1)D1(t-1).$$

9 In fact, some of these observations exceed the published capacity. The question is 
whether the unconstrained or the constrained models are better abstractions of the 
actual problem. 

The estimation of this equation would still require a value for m(t - 1), 
which is mismeasured in this model by A(t - 1). Consider the 
lagged value of equation (6). Multiply each side of this equation by $\delta$ 
and solve for $\delta\, m(t-1)$. Substituting this result for the first term on 
the right-hand side of the previous equation and collecting terms 
yields: 

$$\Delta A(t) = \delta A(t-1) + \beta_1 C_X(t) + \lambda(t)D1(t) - (1+\delta)\lambda(t-1)D1(t-1).$$

This equation was empirically estimated in the following form: 

$$\Delta A(t) = \delta A(t-1) + \beta_1 C_X(t) + \beta_2 D1(t) + \beta_3 D1(t-1). \tag{2''}$$

The values of $\lambda(t)$ and $\lambda(t-1)$ are theoretically dependent on both 
input and output prices as they represent the shadow price of changing 
the constraint by a marginal unit. However, this dummy variable 
formulation forces these values to be constant. In effect $\beta_2$ and $\beta_3$ 
estimate the average values of $\lambda(t)$ and $(-1-\delta)\lambda(t-1)$. The statistical 
estimator that is used for this problem involving current and 
lagged dummy endogenous variables is discussed in Section VC. Results 
from estimating equation (2'') are discussed as the constrained 
model in the following sections. 
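
The algebra leading to the constrained model can be verified numerically. The sketch below simulates a shadow-price path that satisfies the discrete dynamic condition, adds a capacity term as in equation (6), and checks that the first difference of A(t) matches the expression obtained after substituting for m(t - 1). All numerical values (discount rate, shadow prices, stock effects) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
delta, beta1 = 0.012, 1.0                       # hypothetical parameter values
lam = rng.uniform(0.5, 1.5, T)                  # shadow price of capacity, lambda(t)
D1 = (rng.uniform(size=T) < 0.3).astype(float)  # 1 when the constraint binds
C_X = -rng.uniform(0.1, 0.2, T)                 # stock effect, negative by assumption

# Shadow price of the resource obeying m(t) - m(t-1) = delta*m(t-1) + beta1*C_X(t).
m = np.empty(T)
m[0] = 10.0
for t in range(1, T):
    m[t] = (1.0 + delta) * m[t - 1] + beta1 * C_X[t]

A = m + lam * D1                                # mismeasured shadow price, eq. (6)
dA = A[1:] - A[:-1]
rhs = (delta * A[:-1] + beta1 * C_X[1:]
       + lam[1:] * D1[1:] - (1.0 + delta) * lam[:-1] * D1[:-1])
print(np.allclose(dA, rhs))
```

The check confirms that the coefficient on the lagged dummy combines the lagged shadow price with the discount factor, which is why the estimated coefficient on D1(t - 1) is interpreted as an average of $(-1-\delta)\lambda(t-1)$.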

D. Mining Practice 

Because of unusual preliminary results, standard mining practice was 
investigated more carefully. It was found that rules of thumb in the 
mining industry could be used as a heuristic alternative hypothesis. 
The basic model abstracts from the fact that mine operators typically 
produce from a heterogeneous block of ore and so must choose the 
grade of ore to mine at every depth. However, because of the produc¬ 
tion process it is difficult to control exactly the grade of ore that is 
extracted. It may be that variations in the grade of ore are just part of 
the many small errors the firm makes in actually choosing and ex¬ 
tracting the output. In this case the basic model may still be a good 
description of the behavior of the firm. However, in case the basic 
model is not accepted as a model of the firm, the standard mining 
practice suggests an alternative hypothesis. 

The complication of choosing the grade of ore is known as the 
cutoff-grade problem in mining engineering. While the theory of this 
problem appears to be in a state of flux (Rudenno 1974; Dowd 1976), 
common practice is such that when output prices rise the firm reduces 
the lowest grade of ore that is mined. The argument that is used to 
justify this rule of thumb is that the current higher price pays the 
higher extraction and processing costs implied by lower-quality ore; 
that is, higher marginal revenue leads to producing higher-marginal- 
cost units. 10 Indeed, this is apparently such a common practice that 
Harold Drake, silver analyst for the Bureau of Mines, stated: “Mine 
production declined moderately in 1979 mainly as a result of process¬ 
ing larger quantities of lower grade ore made economically possible 
by the higher prices in 1979” (U.S. Department of the Interior, 
Bureau of Mines 1980, p. 802). 

When output is measured by units of pure metal, the standard 
practice of cutoff-grade mining could lead to the continual forward 
shifting of production if price increases dominated the sample pe¬ 
riod. The reason is that this practice would shift the extraction of 
marginal units of pure metal to lower-priced periods in the future. In 
addition, some constraint on production is required such that the 
quantity of ore is not increased sufficiently to offset the decline in the 
grade. Postponing the production of the marginal unit of output to a 
time when the price is lower may cause the data to reveal a negative 
coefficient in place of $\delta$, the rate of discount, in the Hotelling model. 
It must be emphasized that this is not a formal alternative hypothesis. 
The coefficient estimates obtained to test the Hotelling model are 
strictly interpretable only within that framework. However, the data 
set used in this study does encompass a period of rapidly rising metal 
prices. Expectations of rapidly rising prices may have dominated the 
sample period even though an ad hoc sign test showed 42 declines in 
price and 41 price increases. 

Standard mining practice therefore provides a heuristic alternative 
hypothesis that is qualitatively distinct from the Hotelling model. In 
the Hotelling model a period of rising prices is compatible with the 
basic model or with an increase in demand that leads to increased 
production in the higher-priced time period (Herfindahl and Kneese 
1974). The latter case may be appropriate to the late 1970s, when 
many resource commodity prices increased rapidly. The estimates of 
$\delta$ would be positive in either case. 11 In contrast, rising prices under 


10 It is not yet established whether this rule of thumb is consistent with discounted 
profit maximization. Preliminary results obtained by Krautkraemer (1984) indicate it is 
not. 

11 The estimate of $\delta$ remains positive for three reasons when prices are rising because 
of increasing demand. Consider the shifting demand as an omitted variable. If the 
shifts in industry demand, which are exogenous to the firm, result in a constant dollar 
increase in the in situ value, then only a constant needs to be appended to eq. (2'). The 
expected value of $\delta$ would not be affected. The regressions were, in fact, estimated with 
a constant term. If the increase in demand caused the in situ value to increase by a 
nonconstant amount, then we must note that the increase in demand will be positively 
standard mining practice imply that the benefit of extracting a marginal 
unit of output at a high price is traded off for extracting that 
unit in the future at a lower price. This may be revealed as a negative 
estimate of $\delta$. 


IV. Transition to a Testable Model 

In order to test the hypotheses derived from the model of optimal 
resource extraction, it is necessary to replace the marginal cost of 
extraction and the stock effect, neither of which is directly observable, 
by values obtained from estimating a cost function. The stochastic 
structure of the estimators is complex due to the use of time-series 
data, a system of equations, and lagged endogenous variables. 

A. Functional Form of the Cost Function 

The resource model requires estimates of cost as a function of current 
output and the stock variable. The theoretical model defined the 
stock variable to be the stock remaining in the mine. This information 
is unavailable. However, the depth of production is known. The relationship 
between the depth of production, D(t), and the stock remaining 
in the mine can be defined as follows. Recall that the theoretical 
model assumed a known stock of ore, B, and a constant grade of ore 
in the mine. Define K to be the quantity of metal per foot obtained 
from the constant grade of ore. Then the stock of metal remaining in 
the mine is 

X(t) = B - K* D(t). (7) 

Extraction is assumed to proceed linearly from foot 0 to the depth at 
time t. This implies that K * D(t) is equivalent to cumulative production. 
Note also that the inverse relationship between depth and stock 
in the mine implies that the partial derivatives of the cost function 
with respect to depth are opposite in sign to those with respect to the 
stock. Rearranging equation (7) yields: 

$$D(t) = \frac{B - X(t)}{K}. \tag{8}$$


correlated with the lagged in situ price. As is well known, this will cause the estimated 
coefficient for m(t - 1) to include $\delta$ and a proportion of the positive coefficient for the 
demand variable. The expected value of $\delta$ remains positive. Finally, the shifts in demand 
may just be one of many small factors that comprise the error term. In this case 
the expected value of $\delta$ is positive and equal to the firm's discount rate in the basic case. 
This equation can also be rewritten in terms of current output. Subtracting 
lagged depth from both sides of the equation, rearranging, 
and canceling yields 

$$D(t) = D(t-1) + \frac{R(t)}{K}. \tag{9}$$

Use has been made of the fact that $\Delta X(t) = -R(t)$. Equation (9) will be 
exploited in the calculation of the marginal stock effect. 

The transformation between depth and the stock in the mine is a 
possible source of specification error because it embodies three major 
assumptions: constant grade, known stock, and monotonically in¬ 
creasing depth. The usefulness of this transformation is that it will 
retain a rigid link to the theoretically expected coefficient for the 
stock effect in the dynamic efficiency condition. 
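
The depth-stock transformation in equations (7) through (9) can be illustrated with a few lines of arithmetic. This is a minimal sketch with hypothetical numbers; the function names are illustrative, and the sketch relies on the same three assumptions (constant grade, known stock, monotonically increasing depth).

```python
def stock_from_depth(B: float, K: float, depth: float) -> float:
    """Remaining stock of metal, X(t) = B - K * D(t), as in equation (7)."""
    return B - K * depth

def next_depth(prev_depth: float, output: float, K: float) -> float:
    """Depth after extracting `output` units of metal, D(t) = D(t-1) + R(t)/K."""
    return prev_depth + output / K

# Hypothetical values: initial stock B, metal per foot K, current depth and output.
B, K = 1000.0, 2.0
d0 = 100.0
d1 = next_depth(d0, output=10.0, K=K)
print(stock_from_depth(B, K, d0), stock_from_depth(B, K, d1))
```

In this example the stock falls by exactly the amount extracted, which is the identity used to move between depth and the unobserved stock.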

A weakly separable cost function is used to model the production 
characteristics of the firm. Thus 

$$C(W, D, R) = f(W)\,g(R)\,h(D). \tag{10}$$

Structural forms of this type assume that the underlying production 
function is homothetic. Furthermore, no interaction is allowed be¬ 
tween the depth and output. Thus output affects costs independently 
of the depth of extraction. The independence of output and depth in 
the specification is the result of tests conducted with the data. 12 The 
functional form was further restricted to be homogeneous. The trans¬ 
log functional form was chosen for estimation (Christensen, Jorgen¬ 
son, and Lau 1973; Diewert 1974). This functional form is widely 
used because of its interpretation as a local second-order approxima¬ 
tion to an arbitrary logarithmic cost function. 13 With the previous 
assumptions, the translog takes the following form: 

$$\ln C = \alpha_0 + AR \cdot \ln R + AD \cdot \ln D + .5 \cdot \sum_{i=1}^{5}\sum_{j=1}^{5} \alpha_{ij} \ln W_i \ln W_j + \sum_{i=1}^{5} \alpha_i \ln W_i. \tag{11}$$

Note that the degree of homogeneity is 1/AR (Silberberg 1978). The 
coefficients of the cost function are restricted to be homogeneous of 


12 The cost function was also estimated for the model that did not impose independence 
of depth and output. An F-test of the two cost models could not reject the 
hypothesis that the two specifications are equivalent. 

13 See Lau (1974) for an elaboration of different definitions of local second-order 
approximations. 
degree one in input prices. That is, the $\alpha_i$ coefficients sum to one and 
the $\alpha_{ij}$ coefficients sum to zero. Symmetry conditions are also imposed; 
that is, $\alpha_{ij} = \alpha_{ji}$. 

The efficiency of the estimation procedure can be increased by 
applying Shephard's Lemma to equation (11). The factor share equation 
(12) may then be estimated simultaneously with equation (11) 
and with cross-equation restrictions imposed as implied by the $\alpha_i$'s and 
$\alpha_{ij}$'s: 14 

$$\frac{W_i L_i}{C} = \frac{\partial \ln C}{\partial \ln W_i} = \alpha_i + \sum_{j=1}^{5} \alpha_{ij} \ln W_j, \quad i = 1, 2, 3, 4. \tag{12}$$

This procedure increases statistical efficiency by increasing the num¬ 
ber of observations without increasing the number of parameters to 
be estimated. 
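
The role of the homogeneity and symmetry restrictions in the share equations can be seen in a small numerical example. The sketch below uses hypothetical coefficients for five inputs (not the paper's estimates) that satisfy the restrictions, and computes the cost shares implied by equation (12); the function name `cost_shares` is an assumption for illustration.

```python
import numpy as np

def cost_shares(alpha, alpha_mat, input_prices):
    """Translog cost shares: s_i = alpha_i + sum_j alpha_ij * ln W_j."""
    return alpha + alpha_mat @ np.log(input_prices)

# Hypothetical coefficients for five inputs (not the paper's estimates).
alpha = np.array([0.30, 0.25, 0.20, 0.15, 0.10])       # sums to one
off_diag = 0.02 * (np.ones((5, 5)) - np.eye(5))         # symmetric, zero diagonal
alpha_mat = off_diag - np.diag(off_diag.sum(axis=1))    # each row now sums to zero

w = np.array([12.0, 3.5, 40.0, 7.0, 1.2])               # hypothetical input prices
shares = cost_shares(alpha, alpha_mat, w)
print(shares, shares.sum())
```

With these restrictions the shares sum to one and are unchanged when all input prices are scaled by the same factor. That adding-up property is the usual reason one share equation is dropped, which is consistent with equation (12) being stated for only four of the five inputs.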

B. Stochastic Specification and 
Estimation of the Cost System 

Recall that the data are generated by a firm in a competitive market. 
This implies that the firm chooses output and depth in order to max¬ 
imize discounted profits. The endogeneity of these variables creates a 
simultaneous equations bias if the actual values of depth and output 
are used in a regression analysis. However, Fair (1972) has developed 
an instrumental variables estimator that yields consistent and efficient 
estimates for simultaneous equations in the presence of lagged en¬ 
dogenous variables and autoregressive errors. The appropriateness 
of this estimator is made clear by considering the implicit cost equa¬ 
tion system to include not only the cost and cost share equations but 
also the equations for the logarithms of output and of depth. The 
latter two equations will be used to specify instruments for the estima¬ 
tion procedure but will not be jointly estimated with equations (11) 
and (12). 

As is well known, differentiating a well-specified profit function 
with respect to the output price yields the supply function, which in 
this case is conditional on the stock and, hence, on depth as well: 15 

$$\frac{\partial \Pi^*}{\partial P} = R^*(P, W, D).$$

14 The restriction that the derived demand function be zero-degree homogeneous 
was not imposed. 

15 Diewert and Lewis (1981) use this result in their derivation of the comparative 
dynamics of stock accumulation. 

However, the derivative of a translog profit function yields an inverse 
of the profit share of gross sales, which does not generate a suitable 
estimate of the logarithm of output. The specification of the 
logarithm of output used in this study is an ad hoc formulation using 
variables that economic theory indicates are correlated with output. 
Consequently the logarithm of output was chosen to be a function of 
the logarithm of the output price and the square of the logarithm of 
that price in addition to the logarithms of depth and input prices. 
Similar reasoning with respect to equation (9) results in specifying the 
logarithm of depth to be a function of the logarithm of lagged depth 
and the logarithm of output in the estimation procedure. 

The need to consider an estimator that allows for both lagged en¬ 
dogenous variables and autocorrelation is now apparent. If lagged 
depth were used as an instrument in an estimation procedure, say 
three-stage least squares, and autocorrelation exists, then the es¬ 
timator would be inconsistent. This is because any instrumental vari¬ 
able constructed as a function of lagged depth would be correlated 
with the autoregressive error term. 16 Consistent parameter estimates 
using Fair’s estimator are obtained by using both current and lagged 
values of all the exogenous variables as instruments for the endoge¬ 
nous variables of output and depth. In addition, the cross-equation 
restrictions implied by the cost share system are imposed, but the 
information contained in the cross-equation correlation of the errors 
is not exploited. In essence, the estimator that generates the reported 
values may be considered analogous to instrumental variables estima¬ 
tion with cross-equation restrictions imposed. 

C. Derived Data 

The estimation of the cost system is necessary to compute the data 
needed to estimate m(t) and the stock effect. Once estimates of the cost 
function are obtained, values of the marginal cost, $C_R$, and of the stock 
effect, $C_X$, are obtained in the following manner. Manipulation of the 
elasticity of cost with respect to output yields the marginal cost of 
extraction defined as 

$$C_R(t) = \frac{\partial \ln C(t)}{\partial \ln R(t)} \cdot \frac{C(t)}{R(t)}. \tag{13}$$

The estimated value of the marginal extraction cost is computed by 
evaluating the terms at the observed prices for the time period. Estimates 
of cost and output are obtained, respectively, from the predicted 
values of the cost equation and from the first stage of the 
estimator. The estimated values are used to obtain a consistent estimate 
of $C_R(t)$ based on the consistency of a random variable constructed 
as a function of consistent parameter estimates. 

16 The three-stage least-squares procedure was used in a preliminary analysis. Analysis 
of the autocorrelation function of the errors from this estimator indicated an autoregressive 
error structure of order greater than one. 

The stock effect is derived in a similar fashion. With reference to 
equation (8), note that 

$$\frac{\partial \ln D(t)}{\partial X(t)} = -\frac{1}{B - X(t)}. \tag{14}$$


The right-hand side is the negative of the inverse of cumulative ex¬ 
traction. Utilizing the definition of the derivative of a logarithm, the 
chain rule, and substituting the equation above yields: 


$$C_X(t) = -\frac{C(t)}{B - X(t)} \cdot \frac{\partial \ln C(t)}{\partial \ln D(t)}. \tag{15}$$


The estimated value of the stock effect is computed in a manner 
similar to that used to compute the marginal extraction cost. The only 
difference is that the actual value of cumulative extraction, B - X(t), 
was used instead of an estimated value. The error in using this approximation 
to the true, optimal level of cumulative extraction is 
assumed to be small. The resulting estimate of $C_X(t)$ is assumed to be 
an instrumental variable that is uncorrelated with any behavioral error 
term in the tests of economic efficiency. 

The parameter estimates obtained by applying Fair's procedure to 
the cost share system resulted in the following estimates for $C_R$ and 
$C_X$, where AC(t) is the average cost: 17 

$$C_R(t) = .06 \cdot AC(t), \tag{16}$$

$$C_X(t) = -.23 \cdot \frac{C(t)}{B - X(t)}. \tag{17}$$

Both of these estimates are of the expected sign. However, the mar¬ 
ginal cost is less than the average cost, implying economies of scale. 
This result reinforces the usefulness of analyzing the constrained 
model as the estimated economies of scale are very large. 18 
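
The derived series can be computed directly from the two estimated elasticities. The sketch below applies the coefficients .06 and .23 reported in equations (16) and (17) to hypothetical values of total cost, output, cumulative extraction, and price; the function names and the numerical inputs are illustrative assumptions.

```python
def marginal_cost(output_elasticity: float, total_cost: float, output: float) -> float:
    """C_R(t): elasticity of cost with respect to output times average cost."""
    return output_elasticity * total_cost / output

def stock_effect(depth_elasticity: float, total_cost: float,
                 cumulative_extraction: float) -> float:
    """C_X(t): minus the depth elasticity times cost per unit of cumulative extraction."""
    return -depth_elasticity * total_cost / cumulative_extraction

def shadow_price(price: float, mc: float) -> float:
    """m(t): output price less marginal extraction cost."""
    return price - mc

# Hypothetical monthly observation (not taken from the firm's data).
total_cost, output, cum_extraction, price = 1000.0, 50.0, 5000.0, 30.0
mc = marginal_cost(0.06, total_cost, output)
cx = stock_effect(0.23, total_cost, cum_extraction)
m = shadow_price(price, mc)
print(mc, cx, m)
```

Because the output elasticity is far below one, the computed marginal cost is a small fraction of average cost, which is the economies-of-scale result noted in the text.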


17 The complete regression results are presented in App. B. 

18 The second-order properties of the estimated cost function also indicate the 
usefulness of analyzing alternative models. The cost function is expected to be monotonically 
increasing in W and concave in W. Monotonicity is met for all observations 
and, based on point estimates of 420 own-price ($C_{ii}$) second derivatives, 410 were 
negative in accord with theory. However, the cost function is neither concave nor 
convex in input prices at any observed set of input prices. Eigenvalue tests for all data 
points indicate a saddle point due to the presence of both positive and negative eigenvalues. 
While the point estimates of the eigenvalues do not indicate concavity, the 
stochastic properties of these estimates are unknown, so that no test for the significance 
of these results can be obtained. 

The value of m(t), the shadow price of the resource, can now be 
constructed. The estimate of the marginal extraction cost is subtracted 
from the output price to compute m(t). 19 The estimates of m(t) 
and $C_X(t)$ are then used in the estimate of the dynamic efficiency 
condition. The estimators used for this condition and the results are 
reported in Section V. 

V. Stochastic Structure and Tests 
of Dynamic Efficiency 

This section describes the stochastic model of the dynamic efficiency 
condition. Several alternative estimators are discussed because the 
stochastic structure is different for alternative models of the dynamic 
efficiency condition. Finally, the estimation results are presented and 
discussed. 

A. Estimation and Testing of the Basic Model 

The basic model, equation (2'), was first estimated using OLS. The 
Durbin-Watson D-test and the partial autocorrelation function were 
examined to determine if the error follows an autoregressive process. 20 
As the errors appeared to be autocorrelated, the order of the 
process was determined by the first P statistically significant values of 
the partial autocorrelation function. An estimator developed by 
Hatanaka (1974) was used to obtain consistent and efficient parameter 
estimates. 21 The results of this procedure are reported under the 
heading “Hatanaka.” 

19 The output price is the net smelter return received by the mine operator. This 
price is composed of the net smelter returns from three jointly produced products. As 
the products are assumed to be produced in fixed proportions the aggregate output 
price is a fixed-weight average of the component prices. This is discussed in more detail 
in App. A. 

20 In the presence of a lagged dependent variable, the D-W D-test is biased toward 
two. For this reason, particular attention is paid to the diagnostic results of the partial 
autocorrelation function (Box and Jenkins 1970). 

21 The Hatanaka estimator requires two stages to achieve consistent and efficient 
estimates. The first stage is the construction of consistent parameter estimates. These 
estimates are provided by the instrumental variables estimator using current and 
lagged exogenous variables as instruments. The second stage uses the estimated errors 
from the instrumental variables estimator to form a consistent estimate of the autoregressive 
parameters. This estimate is computed by an OLS regression of the error term 
on its own lagged value. The Hatanaka procedure then uses the estimate of the autoregressive 
parameters to transform all the right- and left-hand-side variables in a manner 
TABLE 1 

Dynamic Efficiency: Basic Model 

                                                    Hatanaka
                                              ----------------------
                                    OLS         AR(2)         AR(1)

$\delta$                           -.10          -.11         -.28*
  (SE)                             (.06)         (.07)        (.10)
$\beta_1$                       -402.26       -588.91     -1,840.99
  (SE)                          (559.24)      (647.80)     (802.64)
$R^2$                               .05           .05          .11
D-W D-statistic                    1.20          1.81         1.09
Lag of significant partial
  autocorrelations                  1,2             0            0

Note: Other coefficients are reported in App. B. 
* = Significant at the 95 percent level. 



The key results from these estimators are reported in table 1. Analysis 
of the partial autocorrelation function from both the OLS estimator 
and the instrumental variables estimator indicated that an 
AR(2) correction would be appropriate. However, an F-test for the 
joint significance of all the coefficients after the AR(2) transformation 
rejects the set of explanatory variables as significant regressors. While 
the estimates of $\delta$ and $\beta_1$ cannot be rejected at the 95 percent level of 
confidence as equaling their respective null hypothesis values, this is 
relatively weak consistency between data and theory. The estimated 
values of $\delta$ and $\beta_1$ are also insignificantly different from a great many 
negative values that would lead to a rejection of the basic model. 

The lack of robustness of the basic model with respect to the AR 
specification is apparent when an AR(1) process is assumed. This 
specification leads to a flat rejection of the basic model as the estimate 
of $\delta$ is significantly negative, even when joint confidence intervals are 
constructed. 22 Because of the sensitivity of the basic results to the AR 
correction an alternative test of the basic model is developed and 
implemented below. 


analogous to the Cochrane-Orcutt procedure. That is, ρ(i)lag(i)(Z) is subtracted from
each variable Z, where ρ(i) is the estimate of the ith autoregressive parameter and lag(i)
is the lag operator. The final step is obtaining OLS estimates using the transformed
variables. Efficiency is obtained by including the i lagged error terms from the
instrumental variable estimate as regressors. Hatanaka shows that this estimator is consistent
and efficient and has a covariance matrix correctly reported by an OLS procedure.
22 The 95 percent joint confidence interval for δ using the Bonferroni inequality is
-.50 ≤ δ ≤ -.06.
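
The Bonferroni interval quoted in the footnote above can be approximated with a short calculation. This is only a sketch under stated assumptions: it uses a normal critical value (the degrees of freedom are not given here), treats two parameters (δ and β₁) as jointly covered, and starts from the rounded AR(1) estimate and standard error in table 1. The function name is illustrative and not taken from the paper.

```python
from statistics import NormalDist


def bonferroni_interval(estimate, std_error, n_params=2, joint_level=0.95):
    """Joint interval via the Bonferroni inequality (normal approximation assumed)."""
    alpha_each = (1.0 - joint_level) / n_params        # error rate allotted to each parameter
    z = NormalDist().inv_cdf(1.0 - alpha_each / 2.0)   # two-sided critical value
    return estimate - z * std_error, estimate + z * std_error


# AR(1)-corrected basic model: delta = -.28 with standard error .10 (table 1)
low, high = bonferroni_interval(-0.28, 0.10)
print(round(low, 2), round(high, 2))
```

Because both limits stay below zero, the interval is in line with the text's statement that δ is significantly negative even when joint confidence intervals are constructed.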
B. An Average-Cost Estimator


The test of dynamic efficiency is conditional on, among other factors,
the representation of the cost function. The test of the dynamic
efficiency condition can be restated in a way that avoids explicit
estimation of the cost function. Recall that the assumed structure of the
cost function provides a simple model of marginal extraction cost.
With reference to equation (13) and noting that AC(t) is the average
cost,

$$\frac{\partial C}{\partial R} = \frac{\partial \ln C}{\partial \ln R} * AC(t).$$

Using equation (1), the identity for the shadow price of the resource
becomes m(t) = P(t) - [A_R * AC(t)]. Recall that A_R is the coefficient
for the logarithm of output in the cost function. Substitution of the
preceding equation and equation (15) into equation (2') yields a
condition for dynamic efficiency that is equivalent to the basic model.
After rearranging, an alternative specification to test the dynamic
efficiency condition is

$$AC(t) = \gamma_0 + \gamma_1 P(t-1) + \gamma_2 P(t) + \gamma_3 AC(t-1) + \gamma_4 \frac{C(t)}{B - X(t)}, \tag{18}$$

where

$$\gamma_1 = -\frac{1 + \delta}{A_R}, \qquad \gamma_2 = \frac{1}{A_R}, \qquad \gamma_3 = 1 + \delta,$$

C(t) = total cost, and B - X(t) = cumulative extraction.


This estimator is attractive for both its greatly reduced data
requirements and its reduced computational burdens. Note that the cost
system does not have to be estimated and only output prices, output,
total cost, and cumulative production need to be known. However,
the stochastic structure of this model is more complex because of the
presence of current, C(t)/[B - X(t)], and lagged, AC(t - 1), endogenous
variables. The instrumental variables estimator was first used to
obtain consistent parameter estimates under the null hypothesis of no
autocorrelation. However, the partial autocorrelation function of the
errors from these estimates indicated the likely existence of first-order
autocorrelation. Fair’s (1970) estimator was used to construct
consistent and relatively efficient parameter estimates. This estimator
is an iterative instrumental variables procedure that constructs an
TABLE 2

Average-Cost Estimates of Dynamic Efficiency

                                Constant δ      Varying δ(t)
                                   Fair          Inst. Var.

δ                                 -.35*
  (SE)                            (.10)
A_R                              -6.86            -5.32*
  (SE)                           (3.94)           (2.61)
β₁                                n.a.             n.a.
α₂                                                 .41*
  (SE)                                            (.09)
Corr²                              .89              .87
D-W D-statistic                   1.98             1.98
Lag of significant partial
  autocorrelations                 13                0

Note. Other coefficients are reported in App. B.
* = Significant at the 95 percent level.


instrument for the current endogenous regressor and then searches
over values of the autoregressive parameter to minimize the sum of
squared residuals. The results of estimating equation (18) are
reported in table 2.

The standard errors that are reported in table 2 are transformations
of the standard errors estimated in the regression equation. 23
The estimate of δ, which requires only the subtraction of a constant
from the estimate of γ₃, does not change the standard error. The
standard error of A_R is a second-order approximation to the variance
of a ratio of random variables (Mood, Graybill, and Boes 1974).
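
The two transformations described above can be illustrated in a few lines. This is a hedged sketch, not the authors' computation: it swaps in the simpler first-order delta method where the text cites a second-order approximation, and it takes its inputs from the coefficients as read from Appendix tables B4 and B5, whose scan is imperfect. The function names are illustrative.

```python
def shift_by_constant(coef, std_error, constant=1.0):
    """delta = gamma_3 - 1: subtracting a constant leaves the standard error unchanged."""
    return coef - constant, std_error


def invert_with_delta_method(coef, std_error):
    """A_R = 1 / coef, with a first-order delta-method standard error (an assumption)."""
    return 1.0 / coef, std_error / coef ** 2


# Lagged average-cost coefficient in the constant-discount-rate model (table B4)
delta, delta_se = shift_by_constant(0.652, 0.096)

# Price coefficient in the time-varying discount rate model (table B5)
a_r, a_r_se = invert_with_delta_method(-0.188, 0.092)

print(round(delta, 2), delta_se, round(a_r, 2), round(a_r_se, 2))
```

Under these assumptions the results land close to the corresponding entries in table 2; any small gap in the standard error of A_R may reflect the higher-order term and rounding of the inputs.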

The results of this procedure are consistent with earlier tests. With
reference to the estimates from the Hatanaka procedure with the
AR(1) correction, the estimate of the discount rate from the
average-cost estimator is significantly negative and cannot be rejected as being
equivalent to the earlier results at the 95 percent level of
significance. 24 While these results are undoubtedly subject to
different degrees of interpretation, a regression without any explanatory
power and two sensitivity tests that flatly reject the null hypothesis can


23 The estimates of the standard errors are based on the formulas provided by Fair,
which assume zero correlation between the lagged endogenous variables and the error
term, which leads to understating the true covariance matrix. Note also that the
parameter for the stock effect is unidentified and the parameters for output and the discount
rate are overidentified. The nonlinear restriction implied by the theoretical values of
γ₁, γ₂, and γ₃ is not included in the estimation procedure or in the tests of the
hypotheses.

24 The hypothesis test used a t-ratio and assumed that the covariance of the estimates
was zero.




hardly be said to provide assurance of an accurate portrayal of the
behavior of the firm. 25

The first sensitivity test of the basic model allows the discount rate
to be entered as data and to vary over time. The revised average-cost
estimation model is then:

$$AC(t) = \alpha_0 + \alpha_1 P1(t) + \alpha_2 AC1(t-1) + \alpha_3 \left[\frac{C(t)}{B - X(t)}\right], \tag{19}$$

where

$$P1(t) = P(t) - [1 + \delta(t)]P(t-1),$$

$$AC1(t-1) = [1 + \delta(t-1)]AC(t-1),$$

$$\alpha_1 = \frac{1}{A_R}, \qquad \alpha_2 = 1, \qquad \alpha_3 = -\beta_1,$$

C(t) = total cost, and B - X(t) = cumulative extraction.


This equation was estimated using the instrumental variables
estimator with the predetermined variables and input and output prices
as instruments. Analysis of the resulting partial autocorrelation
function indicated an absence of any serial correlation, so that this
estimator is expected to yield consistent parameter estimates.

The results of estimating the time-varying discount rate model are
also reported in table 2. The parameter estimates of α₁ and α₂ are
significantly different from their expected values; α₂ is significantly
different from one and α₁ is a significantly negative number. These
results indicate an inconsistency between the data and the alternative
model that allows the discount rate to vary over time.


C. The Expected-Price and the Constrained Model 

The inapplicability of the basic model and the economically unsatisfying
estimation of the cost function indicate the usefulness of analyzing


25 The coefficient on output, though of the wrong sign, is insignificantly different
from zero and from many positive values.
the expected-price and constrained models. Estimation of the
expected-price model does not present any added difficulties. However,
estimation of the constrained model is slightly more complicated.
Recall that this model includes a current and lagged dummy variable
that is equal to one if the rate of extraction equals or exceeds the rated
capacity. This is plainly a case of a dummy variable defined by an
endogenous variable crossing a threshold. Heckman (1978) and Duncan
(1980) have investigated models of this type. For the limited
information model estimated as equation (2"), Heckman shows that
consistent estimates can be obtained by forming an instrument for the
dummy variable using a linear probability model. The instrumental
variable estimator can then be applied in a straightforward way.
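
The construction of an instrument from a linear probability model can be sketched as follows. A footnote to this section states that fitted values from an OLS regression were set to zero or one when they fell outside that range; the sketch reproduces only that clipping step. The single regressor, the data, and the function names are hypothetical simplifications, since the paper's regression used the full instrument set required for Fair's procedure.

```python
def ols_line(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def lpm_instrument(x, dummy):
    """Fitted values of a linear probability model, clipped to the unit interval."""
    intercept, slope = ols_line(x, dummy)
    return [min(1.0, max(0.0, intercept + slope * value)) for value in x]


# Hypothetical instrument values and a capacity dummy
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d1 = [0, 0, 0, 1, 1, 1]
d1hat = lpm_instrument(x, d1)
print(d1hat)
```

In this toy data the unclipped fit falls below zero at the first observation and above one at the last, so both ends are moved to the boundary while interior fitted values are left as they are.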

An analysis of the partial autocorrelation function of the error
indicated significant first-order autocorrelation. Because both current
and lagged endogenous variables occur in equation (2"), it would
appear that Fair’s instrumental variable estimator is appropriate.
However, the use of a linear probability model to develop an
instrumental variable precludes the use of a standard multistage program.
Therefore an instrumental variable for the dummy variable was
estimated in a separate regression using the instruments required for
Fair’s procedure. The instrumental variable still could not be used in
a standard AR(1) correction package as the product of the
autoregressive parameter and the true lagged value, not the instrumental
variable value, is to be subtracted from the instrumental variable. The
actual parameter estimates were obtained using a nonlinear
least-squares routine for the model

$$\Delta M(t) = \beta_0 + \rho\,\Delta M(t-1) + \delta M(t-1) - \rho\delta M(t-2) + \beta_1 CX(t) - \rho\beta_1 CX(t-1) + \beta_2 D1HAT(t) + (\beta_3 - \beta_2\rho) D1(t-1) - \beta_3\rho D1(t-2),$$

where ρ is the autoregressive parameter and D1HAT is the predicted
value from the linear probability model. 26
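
The structure of the AR(1)-transformed model can be checked numerically. The sketch below is an illustration under assumptions: it reads the dependent variable as the first difference of the scarcity rent (DM in Appendix B), writes the constant as β₀(1 - ρ), and uses made-up data and parameter values. It confirms that quasi-differencing the untransformed equation, with the instrument replacing only the current dummy, gives the same residual as the expanded form.

```python
def quasi_differenced_residual(t, dm, m, cx, d1, d1hat, b0, delta, b1, b2, b3, rho):
    """e(t) = u(t) - rho * u(t-1); only the current dummy in u(t) uses the instrument."""
    def u(s, current_dummy):
        return (dm[s] - b0 - delta * m[s - 1] - b1 * cx[s]
                - b2 * current_dummy - b3 * d1[s - 1])
    return u(t, d1hat[t]) - rho * u(t - 1, d1[t - 1])


def expanded_prediction(t, dm, m, cx, d1, d1hat, b0, delta, b1, b2, b3, rho):
    """Right-hand side of the expanded model; constant taken as b0 * (1 - rho)."""
    return (b0 * (1 - rho) + rho * dm[t - 1]
            + delta * m[t - 1] - rho * delta * m[t - 2]
            + b1 * cx[t] - rho * b1 * cx[t - 1]
            + b2 * d1hat[t]
            + (b3 - b2 * rho) * d1[t - 1] - b3 * rho * d1[t - 2])


# Hypothetical series and parameter values, for illustration only
m = [10.0, 12.0, 11.0, 15.0, 14.0]
dm = [0.0] + [m[i] - m[i - 1] for i in range(1, len(m))]
cx = [0.01, 0.02, 0.015, 0.03, 0.025]
d1 = [0, 1, 0, 1, 1]
d1hat = [0.2, 0.7, 0.3, 0.8, 0.6]
params = dict(b0=5.0, delta=-0.3, b1=-100.0, b2=20.0, b3=-4.0, rho=0.4)

t = 3
residual = quasi_differenced_residual(t, dm, m, cx, d1, d1hat, **params)
gap = dm[t] - expanded_prediction(t, dm, m, cx, d1, d1hat, **params)
print(abs(residual - gap) < 1e-9)
```

The coefficient on D1(t - 1) mixes β₃ and β₂ρ because the lagged dummy enters once as a regressor and once through the quasi-differencing of the current dummy, which is why a packaged AR(1) correction applied to the instrument would not reproduce it.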

The results of estimating the expected-price and the constrained
models are reported below in table 3. Both models lead to the
rejection of the hypothesis that δ is positive. 27 The conclusion to be drawn


26 The predicted values were determined by the fitted values from an OLS regression
with values less than zero or greater than one set equal to zero and one, respectively.
The nonlinear least-squares routine used Gauss’s method. The standard errors
reported from the nonlinear procedure would be incorrect because the instrumental
variable, D1HAT, would be a component of the predicted values. Consequently Fair's
estimator of the covariance matrix was estimated separately. It is these latter estimates that
are reported as the standard errors.

27 The Bonferroni joint confidence interval for δ is -.37 ≤ δ ≤ -.05 for the
expected-price model and -.65 ≤ δ ≤ -.11 for the constrained model.
TABLE 3

Dynamic Efficiency: Expected-Price and Constrained Models

                              Expected-Price Model    Constrained Model
                                      OLS                   Fair

δ                                    -.21*                 -.38*
  (SE)                               (.07)                 (.12)
β₁                               -1,202.35             -1,336.94
  (SE)                            (715.90)            (1,636.37)
R² / correlation squared              .10                   .52
D-W D-statistic                      1.81                  n.a.
Lag of significant partial
  autocorrelations                      0                     0

Note. Other coefficients are in App. B.
* = Significant at the 95 percent level.


is that neither of these two sensitivity tests of the set of maintained 
hypotheses and the Hotelling hypothesis is consistent with the process 
that generated the data. 

VI. Conclusion 

This paper has applied a new data set with previously unavailable
information on prices and cost to the extraction problem of a mining
firm. A basic model of optimal extraction was reviewed and additional
assumptions were incorporated to construct an empirical test of the
model. The basic model was constructed to test Hotelling’s model of
optimal extraction. Three variations of the basic model were
developed to incorporate a time-varying discount rate, price expectations,
and a constraint on the rate of output.

The conclusion obtained from the empirical results is that the set of
the basic Hotelling hypothesis and all the maintained hypotheses is
rejected as a description of the firm’s behavior. This conclusion is
based on an estimate of the parameter that is expected to reveal the
firm’s discount rate, δ, but instead is estimated to be significantly
negative in the AR(1)-corrected version of the basic model. This
result is also obtained for the time-varying discount rate and the
expected-price and constrained output models. The AR(2)-corrected
version of the basic model has no explanatory power.

The results obtained in this study are important because they reject
the empirical application of a widely used model as a positive
description of the behavior of a mining firm. While it is true that rejection of
any of the models tested may have resulted from the maintained
hypotheses necessary to advance to the estimation stage, it is clear that
the Hotelling model is insufficiently robust to be confidently applied
[...] the Hotelling model, either in the [...] study or augmented to
include growth in the resource stock, is used to describe behavior in a
wide variety of natural-resource-based industries. The rejection of the
model as a description of a mining firm indicates the fruitfulness of
increasing the effort devoted to the empirical examination of the
extraction decisions of firms in the fishing industry, the forest products
industry, and the crude petroleum industry, among other possible cases.

In a broader context, rational resource policy debates are impossible
if available models are not consistent with the actual behavior of
firms. If an accurate description of behavior is unavailable, there is no
basis on which to argue that firms extract too much, too little, or just
the correct amount. Therefore, further research is indicated to
construct models, including improved representation of stock effects in
cost functions or other components of the maintained hypotheses,
that do describe the actual behavior of firms that extract natural
resources.


One direction for research is to relax the assumptions of the basic
model. The results of this study and standard mining practice suggest
that the assumption of a constant grade, even for a particular mine,
may be misleading. It is important to realize that the ore-grade
problem is not that typically discussed in economics as "multiple grades.”
The problem a mining firm can face is a distribution of grade at every
depth. The cost of extracting a fixed range of grades is likely to have
an associated stock effect. If the physical limits of mining are set to
extract only a range of high grades, the increased depth associated
with pursuing the high grades may eventually raise costs to the point
that justifies extracting lower grades at shallower depths. The
problem may be complicated by costs associated with returning to
previously mined depths. For instance, costs may be incurred to clear rock
falls or to restore the roof support. Formal empirical tests
incorporating uncertainty, more refined models of price expectations, and
models distinguishing between extraction for development and for
production may also prove fruitful directions for empirical research.
However, these theoretical complications may not be empirically
distinguishable without an increase in the statistical efficiency of the
parameter estimates. Alternative econometric estimators that focus
on unobserved variables and vector autoregressive systems would
usefully supplement the development of more complex theoretical
models. Finally, it should be recalled that the results in this study are
obtained by analyzing data from a single firm. Further research is
justified in trying to extend the quantity of data available to test the
microeconomic behavior of firms that extract natural resources.
The company generously made available virtually all accounting records for
the period studied. A majority of the data are obtained from income
statements for internal use. These statements were available monthly for the
period from January 1975 to December 1981. Extensive use is also made of
reports on the concentrating process. Other firm-specific sources included
invoices in order to obtain input prices, inventory records, and miscellaneous
operating reports. Several variables included in the estimation of the cost
function are aggregates of company data. These are discussed in the
following paragraphs.

The particular mine being studied produces three joint outputs. The
assumption of a constant grade implies that these products are produced in
fixed proportions. Thus the costs will be proportional to a measure of only
one of the outputs. A ton of pure lead is used as the measure of units of
output.

The calculation of the output price is also based on the joint products being
produced in fixed proportions. If $k_{1j}$ is the measure of fixed proportion
between metals one and j, then the unit price (per unit of metal one) is

$$P(t) = P_1 + \sum_{j=2}^{3} \frac{P_j}{k_{1j}}.$$

The $k_{1j}$ were computed by a least-squares regression on the monthly average
grades of the metals. The price data used in the calculation are the net
smelter returns, the price the mine operator estimates that he receives for the
product. Estimation of the net smelter returns is a complex procedure
because of the complexity of the smelter contract. Therefore the true price may
differ from the price that the firm believes it is receiving for each product.

Input prices are also constructed from functions of the available data. The
separable input price component of the cost function, W, is itself assumed
separable in the following manner:

g(W) = C[a(production labor), b(nonproduction labor), c(mine supplies),
d(concentrating plant inputs), e(capital)].

Separability requires that marginal rates of substitution within the
members of the separable class be independent of the level of inputs in the other
classes (Leontief 1947; Berndt and Christensen 1975). The input categories
are chosen for the following reasons:

1. The concentrating plant embodies different input trade-offs than the 
mine plant. 

2. Research by Christensen and Berndt (1974) in the manufacturing sector 
and observation of the mining process led to assuming that production 
and nonproduction labor are separable. 

3. Supplies and capital are assumed to be separable categories based on
common usage.

Some of the input price data are directly available from company reports.
The company itself aggregates production labor, reporting an average
production worker’s wage per mine shift. The reported average cost of
nonproduction workers is also used as an input price.
Laspeyres price indices for both mine supplies and the concentrating plant
were constructed from cost share and input price data. This form of
aggregation assumes fixed proportions within the aggregated class. In each of these
aggregate input price categories, firm-specific input prices were available for
items that comprised large shares of the total cost. These items, respectively,
accounted for 42 and 37 percent of the mine supply and concentrating plant
costs in the base period. For instance, the mine supplies index includes data
on the price of explosives, rock bolts, timber, and electricity. The
concentrating plant index includes data on electricity, reagents, and inputs for the
crushing segment of the process. In both cases, the portion of the index cost
that was not covered by firm data used firm data weights but used the
wholesale price index for all commodities except farm products as the
relevant price.
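
A Laspeyres index of this kind can be written in base-period cost-share form, where each item's price relative is weighted by its share of base-period cost. The sketch below is illustrative only: the split of the 42 percent covered share into two items, the price relatives, and the function name are all hypothetical, and the uncovered remainder is proxied by a wholesale price index relative as the text describes.

```python
def laspeyres_from_shares(base_shares, price_relatives):
    """Laspeyres price index: base-period cost shares times price relatives p_t / p_0."""
    if abs(sum(base_shares) - 1.0) > 1e-9:
        raise ValueError("base-period cost shares must sum to one")
    return sum(share * rel for share, rel in zip(base_shares, price_relatives))


# Hypothetical mine-supplies example: two covered items (25% + 17% = 42% of
# base-period cost) use firm-specific prices; the remaining 58% uses the
# wholesale price index relative as its price.
shares = [0.25, 0.17, 0.58]
relatives = [1.10, 1.00, 1.05]   # explosives, timber, wholesale price index
index = laspeyres_from_shares(shares, relatives)
print(round(index, 3))
```

Holding the base-period shares fixed is what the text means by assuming fixed proportions within the aggregated class.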

The measure of the input price of capital is the prime rate for short-term
business loans. While this price does not include the complexity of the firm's
financial structure or taxes, it was chosen as a reasonable instrument because
the firm borrows substantial sums of money at the prime rate of interest plus
a fixed percentage.

The depth of production, D(t), is a weighted-average depth. The weights
used are the proportional tons of material extracted from a given depth of
the mine per month.

The indicator variable D1 identifies the time periods that actual output
equaled or exceeded rated capacity. The rated capacity was obtained from
published engineering studies that cannot be cited without revealing the mine
used as the case study.


Appendix B 

Detailed Regression Results 

The following tables present the computer output reported in the text.
Variables are defined in Appendix A. The following naming conventions are
used:

1. RPB = output

2. D = depth

3. P = output price

4. M = scarcity rent

5. DM = first difference of the scarcity rent

6. CX = stock effect

7. X = total cost divided by cumulative extraction

8. W1 = production labor, dollars per manshift

9. W2 = nonproduction labor, dollars per manshift

10. W3 = Laspeyres price index of mine supplies

11. W4 = Laspeyres price index of concentrating plant costs

12. W5 = monthly average prime rate for short-term business loans (Board
of Governors 1975-81)

13. L as a first letter is a logarithmic value unless "lag" is spelled. Square
terms contain only one L, for example, LW1W2 = LW1 * LW2.

14. Lagi refers to a lag of length i. A 1-period lag omits the i.

15. Variables preceded by “ONE." refer to estimated values obtained from
an instrumental variables procedure.

16. RHO as part of a variable name refers to a variable that has been
corrected for first-order autocorrelation.

17. E and U are both used to describe estimated error terms. They may be
followed by the estimator that generated the estimates; for example,
EOLS refers to errors obtained from an OLS regression.


TABLE B1

Cost Function

                        Parameter      Standard
Variable                 Estimate         Error

INTERCEPT                  10.898         1.815
ONE.LRPB                     .055          .043
ONE.LD                       .233          .194
LW1                         -.407          .123
LW2                         -.107          .040
LW3                         1.058          .118
LW4                          .290          .114
LW5                          .227          .095
LW1W1                        .240          .033
LW1W2                        .013          .010
LW1W3                       -.181          .030
LW1W4                       -.020          .028
LW1W5                       -.056          .010
LW2W2                        .030          .007
LW2W3                       -.005          .011
LW2W4                       -.018          .013
LW2W5                       -.011          .004
LW3W3                       -.130          .033
LW3W4                        .150          .035
LW3W5                        .022          .015
LW4W4                       -.112          .040
LW4W5                       -.000          .012
LW5W5                        .083          .022

Correlation squared: .91
Dependent variable: LCOST






TABLE B2

Dynamic Efficiency: Basic Model, Hatanaka Estimator [AR(1)]

                        Parameter      Standard
Variable                 Estimate         Error

INTERCEPT                  47.357        66.119
DIFRHOM                     -.281          .097
DIFRHOCX                -1,340.99       862.639
LAGUHAT                     -.170          .140

R²: .11
MSE: 10,242.10
D-W D-statistic: 1.69
Dependent variable: RHODM



TABLE B3

Dynamic Efficiency: Basic Model, Hatanaka Estimator [AR(2)]

                        Parameter      Standard
Variable                 Estimate         Error

INTERCEPT                 -82.937        65.203
DIFRHOM                     -.130          .072
DIFRHOCX                 -588.911       647.302
LAGUHAT                     -.017          .120
LAG2UHAT                    -.014          .118

R²: .IB
MSE: 15.2-17 I II
D-W D-statistic: 1.81
Dependent variable: RHODM


TABLE B4

Average-Cost Estimator: Constant Discount Rate, Fair's Estimator

                        Parameter      Standard
Variable                 Estimate         Error

INTERCEPT                 -81.154        51.997
LAGAC                        .652          .096
P                          - 1413          .084
LAGP                         .268          .087
X                         505.513       205.442

Correlation squared: .89
MSE: 18,976.77
D-W D-statistic: 1.98
Dependent variable: RHOAC


TABLE B5

Average-Cost Estimator: Varying Discount Rate, Instrumental Variables

                        Parameter      Standard
Variable                 Estimate         Error

INTERCEPT                 -10.335        83.268
P1                          -.188          .092
AC1                          .409          .091
X                         855.246       268.245

Correlation squared: .87
MSE: 15,531.65
D-W D-statistic: 1.98
Dependent variable: AC
TABLE B6

Dynamic Efficiency: Expected Prices

                        Parameter      Standard
Variable                 Estimate         Error

Dependent Variable: P

INTERCEPT                  61.126        47.763
LAGP                        1.497          .126
LAG2P                      -1.015
LAG3P                        .595          .286
LAG4P                       -.195          .290
LAG5P                       -.081          .247
LAG6P                        .140          .130

R²: .90
MSE: 14,705.36
D-W D-statistic: 1.89

Dependent Variable: DM

INTERCEPT                  26.629        88.351
LAGM                        -.206          .073
CX                      -1,202.35       715.898

R²: .10
MSE: 27,438.67
D-W D-statistic: 1.81


TABLE B7

Dynamic Efficiency: Constrained Model, Fair/NLIN

                        Parameter      Standard
Variable                 Estimate         Error

INTERCEPT                  43.535       234.774
LAGM                        -.384          .121
CX                      -1,336.94      1,636.37
D1                        363.350        79.427
LAGD1                     -39.122        45.477

Correlation squared: .52
Dependent variable: DM
A firm's actions in one market can change competitors' strategies in a
second market by affecting its own marginal costs in that other
market. Whether the action provides costs or benefits in the second
market depends on (a) whether it increases or decreases marginal
costs in the second market and (b) whether competitors’ products are
strategic substitutes or strategic complements. The latter distinction
is determined by whether more "aggressive" play (e.g., lower price
or higher quantity) by one firm in a market lowers or raises
competing firms' marginal profitabilities in that market. Many recent results
in oligopoly theory can be most easily understood in terms of
strategic substitutes and complements.


I. Introduction 

There are two main points to this paper. First, changes in a firm’s
opportunities in one market may affect its profits by influencing its
competitors’ (or potential competitors’) strategies in a second
oligopoly market. Second, two factors determine whether the resulting
changes in competitors’ strategies will raise or lower profits. They
are (a) whether the two markets exhibit joint economies or
diseconomies and (b) whether the competitors regard the products as
strategic substitutes or strategic complements.

Strategic substitutes and complements are defined precisely in
Section III, but we give a rough explanation now: Conventional
substitutes and complements can be distinguished by whether a more
“aggressive” strategy by firm A (e.g., lower price in price competition,
greater quantity in quantity competition, increased advertising, etc.)
lowers or raises firm B's total profits. Strategic substitutes and
complements are analogously defined by whether a more “aggressive”
strategy by A lowers or raises B's marginal profits.

When costs are interrelated across markets, a tax, cost, or demand
shock in a monopoly or competitive market 1 has both a direct effect
on the profits of a firm (the extra profit or loss the firm would make in
that market without any change in output levels) and an indirect
effect. After the shock the firm’s previous allocation of outputs
between market 1 and its other markets is no longer profit maximizing,
since the marginal gain from selling a unit in market 1 has changed.
The firm will reoptimize. After a tax cut on output sold in market 1,
for example, we will see that the firm increases output in market 1
and either increases or decreases output in a second market, market
2, depending on whether the markets exhibit joint economies or
diseconomies.

For small enough shocks we know (by the envelope theorem) that
the reoptimization has a negligible effect on the firm's profits if it is
also a monopolist or pure competitor in market 2. But these marginal
changes in strategy can have first-order effects on the firm’s profits if
it is an oligopolist in the second market. The reason is that small
changes in firm A’s equilibrium strategy in market 2 will cause small
changes in its competitor B’s marginal profit schedule and thus
induce small changes in B's market 2 strategy. These small changes in
B’s strategy have first-order effects on A's profits.

This strategic effect on profits exists in virtually any oligopolistic
setting, including price competition, quantity competition, and
collusive behavior.

Section II provides a numerical example of the strategic effect. It is
a simple Cournot model in which two firms sell in one market and one
of them is a monopolist in the second market. The markets clear
simultaneously so that the monopolist cannot precommit to staying
out of the monopoly market. The strategic effect is so strong that a
subsidy to sales in the monopoly market reduces the monopolist’s
total profits.
In Section III we develop a simple model of the strategic effect,
precisely define strategic substitutes and complements, and show
their importance in determining whether the strategic effect increases
or decreases profits. One cannot determine whether products are
strategic substitutes or complements without empirically analyzing a
market. For example, quantity competition and constant elasticity
demand may yield strategic complements, but a linear demand curve
with the same elasticity around equilibrium will always yield strategic
substitutes. Price competition, also, can give either strategic
complements or strategic substitutes.

We extend the analysis (in Sec. IV) to models of sequential markets
where a firm makes strategic choices in one period taking into account
their impact in a second period. Whereas in simultaneous markets a
firm may be hurt by a monopoly opportunity, such a result is not
possible in sequential markets if the monopoly market clears first. On
the other hand, in the sequential case a firm may produce in a market
in which revenues are less than costs because of the strategic
implications, but this cannot occur when markets operate simultaneously.
Examples of sequential markets include models of the learning curve
(see, e.g., Spence 1981; Lieberman 1982; Fudenberg and Tirole
1983). We show the analogy to models in which firms make capacity
investments in one period to deter entry in a second period (as in
Spence 1977; Dixit 1980). In contrast to the earlier literature, we
show that (with strategic complements) firms may strategically
underinvest in capital to reduce the ferocity of future competition. Baldani
(1983) and Schmalensee (1983) present related models with strategic
effects in advertising. Many of the results of our sequential analysis
have been developed independently by Fudenberg and Tirole (1984).

In Section V we explore the special cases of quantity competition
and price competition. In each case we give intuitive criteria for
determining whether we have strategic substitutes or complements.

Section VI gives a series of applications in a variety of oligopolistic
settings. We show that the choice of strategic assumption (strategic
substitutes or complements) is the crucial determinant of the results
of many oligopoly models.

We conclude in Section VII. 

II. Numerical Example 

Consider a firm A that is a monopolist in market 1 and a duopolist
with firm B in market 2. Assume that demand is infinitely elastic in
market 1 at $p_1 = 50$ and that inverse demand in market 2 is
$p_2 = 200 - q_2^A - q_2^B$, where $q_j^i$ is the output of firm i in market j.
Total costs for A are $C^A = F + \tfrac{1}{2}(q_1^A + q_2^A)^2$ and total costs
for B are, symmetrically, $C^B = F + \tfrac{1}{2}(q_2^B)^2$. We assume
F > 1,512½; the fixed cost is relevant
only because it prevents firms from wanting to set up multiple plants.
For F > 1,512½, firms’ average costs will always be decreasing in the
relevant range, even though marginal costs are increasing.

In Cournot equilibrium $q_1^A = 0$, $q_2^A = q_2^B = 50$. Each firm earns
profits of 3,750 - F. Marginal revenue equals marginal cost of 50 for
each firm in the markets in which it operates.

Now imagine that something happens to either increase As mar¬ 
ginal revenue schedule or decrease its marginal cost schedule in mar¬ 
ket 1. For example, assume that a demand shock raises the price in 
market 1 to 55 or, equivalently, A is offered a subsidy of 5 for each 
unit sold in market 1. 

The Cournot equilibrium is now $q_1^A = 8$, $q_2^A = 47$, $q_2^B = 51$.
Marginal revenue equals marginal cost of 55 in both markets for A, and
MR = MC = 51 in market 2 for firm B. Firm B's profits rise to
$3{,}901\tfrac{1}{2} - F$, but A's profits fall to $3{,}721\tfrac{1}{2} - F$.
The "positive" shock to market 1 has hurt A.¹
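
The two equilibria above can be checked mechanically. The following is a minimal sketch, assuming the demand and cost functions stated in this section and an interior solution with $p_1 \geq 50$; the function name `cournot` and the dictionary keys are illustrative choices, and the fixed cost F is left out of the reported profits.

```python
from fractions import Fraction as Fr


def cournot(p1):
    """Simultaneous Cournot equilibrium of the Section II example.

    p1 is the market 1 price (the sketch assumes p1 >= 50, so that A's
    market 1 output is nonnegative). Profits are reported net of variable
    costs only; the fixed cost F is left out.
    """
    # First-order conditions:
    #   A, market 1:  p1 = q1a + q2a                   (price = marginal cost)
    #   A, market 2:  200 - 2*q2a - q2b = q1a + q2a    (MR = MC)
    #   B, market 2:  200 - q2a - 2*q2b = q2b          (MR = MC)
    # Substituting the first condition into the second and solving:
    q2b = Fr(200 + p1, 5)
    q2a = 200 - 3 * q2b
    q1a = p1 - q2a
    p2 = 200 - q2a - q2b
    profit_a = p1 * q1a + p2 * q2a - Fr(1, 2) * (q1a + q2a) ** 2
    profit_b = p2 * q2b - Fr(1, 2) * q2b ** 2
    return {"q1a": q1a, "q2a": q2a, "q2b": q2b,
            "profit_a": profit_a, "profit_b": profit_b}


before = cournot(50)
after = cournot(55)
for label, eq in (("before", before), ("after", after)):
    print(label, {name: float(value) for name, value in eq.items()})
```

Exact fractions are used so that the half-unit profit figures are reproduced without rounding.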

We stress that the example does not depend on the assumption of a
Cournot-Nash equilibrium. Nor does it rely on the particular
functional forms chosen. Similar examples are possible whenever A has
joint economies or diseconomies of scope, that is,
$\partial^2 C^A/\partial q_1^A \partial q_2^A \neq 0$.
The main point of this example, however, is not the counterintuitive
result that an increase in price in its monopoly market may hurt a
firm; we would argue that if A knew that the price in market 1 would
not exceed 55 it would find some way to precommit to not selling in
that market. We constructed the example to dramatize a more modest
claim: that in general A's gain in profits from a change in market 1 is
different when it is an oligopolist in market 2 than when it is a
monopolist or pure competitor in that market.

III. Strategic Substitutes and Complements 

How could A lose from its increased profitability in market 1? If firm
B had not changed its strategy from the preshock equilibrium, then
clearly A would make more money as market 1 became more
profitable. Thus it is the "strategic effect" of the change in B's
equilibrium
rium strategy on A’s profits that has overwhelmed any direct positive 
effect of the shock. In this section we present a simple model of the 
strategic effect and discuss the intuition behind it. 

¹ If the two firms are able to achieve the Nash bargaining solution, splitting profits so
that each gets the same gain over its "threat point" of forcing the Nash quantity
equilibrium, then each firm would earn $4{,}062\tfrac{1}{2} - F$ before the shock. After the shock firm B
would earn $4{,}180\tfrac{5}{8} - F$ and A would earn $4{,}000\tfrac{5}{8} - F$. Again A is hurt by the greater
profitability of its monopoly market.

The simplest model of the strategic effect assumes that firm A is a
monopolist in one market, market 1, and a duopolist with B in another
market, market 2.² Firm A chooses strategic variables $S_1^A$ and
$S_2^A$, and B simultaneously chooses $S_2^B$. Assume that a higher level
chosen for this variable indicates more "aggressive" play. For
example, if firms choose quantities or levels of advertising, then
$S_1^A$, $S_2^A$, and $S_2^B$ can be thought of as the quantities of output
or amounts of advertising that the firms choose. If, however, firms
choose prices then, because low prices are a sign of aggressive play,
$S_1^A$, $S_2^A$, and $S_2^B$ can be thought of as the inverses of prices
charged. Without loss of generality we can assume in market 1 that
$S_1^A = q_1^A$, because as a monopolist in market 1 firm A will implicitly
be choosing its quantity there even if it thinks of its strategic
variable as price or advertising. Demand is assumed to be independent
across markets.³

Finally, we assume that there is a "shock" variable Z that affects the
profitability of market 1. An increase in Z of one unit can be
interpreted as either shifting A's marginal revenue curve (as a function of
quantity) in market 1 upward by one unit or shifting its marginal cost 
curve downward by one unit. Equivalently, it may be interpreted as a 
decrease in excise taxes paid by A in market 1 or an increase in a per 
unit subsidy A receives in that market. 

$R_j^F$ is the revenue of firm F in market j, assuming Z = 0, and $C^F$ is
the total cost of firm F, assuming Z = 0. Firm A earns profits of
$\pi^A(S_1^A, S_2^A, S_2^B, Z) = R_1^A(S_1^A) + R_2^A(S_2^A, S_2^B) - C^A(S_1^A, S_2^A, S_2^B) + Z S_1^A$
(because $S_1^A = q_1^A$). Firm B, because it competes only in market 2,
earns $\pi^B(S_2^A, S_2^B) = R_2^B(S_2^A, S_2^B) - C^B(S_2^A, S_2^B)$. If the
profit functions are all differentiable, then there are three
first-order conditions that must be satisfied at an interior Nash
equilibrium:

$$\frac{\partial \pi^A}{\partial S_1^A} = \frac{\partial R_1^A}{\partial S_1^A} - \frac{\partial C^A}{\partial S_1^A} + Z = 0 \tag{1}$$

$$\frac{\partial \pi^A}{\partial S_2^A} = \frac{\partial R_2^A}{\partial S_2^A} - \frac{\partial C^A}{\partial S_2^A} = 0 \tag{2}$$

$$\frac{\partial \pi^B}{\partial S_2^B} = \frac{\partial R_2^B}{\partial S_2^B} - \frac{\partial C^B}{\partial S_2^B} = 0. \tag{3}$$



To examine the effect of a shock that makes market 1 marginally 
more profitable, we totally differentiate the first-order conditions: 

² In Bulow, Geanakoplos, and Klemperer (1983) we consider a more general model
with two firms simultaneously or sequentially competing in each of two markets. The
propositions there can be generalized to many firms.

³ This assumption means that the only effect of $S_1^A$ on the equilibrium choices of $S_2^A$
and $S_2^B$ comes from interrelated costs. In our applications section we generalize to the
case where demands, rather than costs, are interrelated.

These equations can be further simplified by noting that
$\partial\pi^A/\partial Z = q_1^A$. Therefore,
$\partial^2\pi^A/\partial S_1^A\partial Z = 1$, since $S_1^A = q_1^A$, and
$\partial^2\pi^A/\partial S_2^A\partial Z = \partial q_1^A/\partial S_2^A = 0$.
Equations (4), (5), and (6) can thus be summarized as



We assume that the equilibrium is locally strictly stable, which
implies that the determinant $|\pi|$ of the matrix, $\pi$, in (7) is negative
and that, in the absence of market 1, market 2 would still be strictly
stable, hence that $\pi_{22}\pi_{33} > \pi_{32}\pi_{23}$.⁴ We also assume that
the products are substitutes. That is, $\partial\pi^A/\partial S_2^B < 0$ and
$\partial\pi^B/\partial S_2^A < 0$.

Note that if $\partial^2\pi^A/\partial S_1^A\partial S_2^A = -\partial^2 C^A/\partial S_1^A\partial S_2^A < 0$
there are joint diseconomies, or diseconomies of scope, across markets
(being more aggressive in one market and raising sales there lowers
the marginal profits from being a little more aggressive in the other
market), and if $\partial^2\pi^A/\partial S_1^A\partial S_2^A > 0$ there are
joint economies.⁵

It is now possible to solve (7) for $dS_1^A/dZ$, $dS_2^A/dZ$, and
$dS_2^B/dZ$. The following results are easy to derive:


⁴ That is, if we adjust $S_1^A$, $S_2^A$, and $S_2^B$ near the Nash equilibrium according to the
usual rule $\dot S_j^K = \partial\pi^K/\partial S_j^K$, K = A, B, j = 1, 2 (i.e., if marginal revenue exceeds marginal
cost, raise the corresponding strategic variable) then
$d/dt[(\partial\pi^A/\partial S_1^A)^2 + (\partial\pi^A/\partial S_2^A)^2 + (\partial\pi^B/\partial S_2^B)^2] < 0$.
If the matrix $\pi$ is nonsingular, which generically it is, then $\pi$ must be
negative definite if the tâtonnement process is to be strictly stable. Hence $|\pi| < 0$ and
$\pi_{22}\pi_{33} > \pi_{32}\pi_{23}$.

⁵ Alternatively, even if costs were unrelated across markets, $\partial^2\pi^A/\partial S_1^A\partial S_2^A$ would
generally be nonzero if demands in the two markets were interrelated. When the strategic
variables $S_1^A$, $S_2^A$ are quantities $q_1^A$, $q_2^A$, then our definition of joint economies has a
natural interpretation in terms of technologically related marginal production costs
across markets. When the strategic variables represent prices, then the costs of
production must be multiplied by the induced changes in quantity. However, it is easy to show
that the same technological relation between marginal production costs is again a
proper interpretation.

1. $dS_1^A/dZ > 0$: A positive shock to the marginal profitability of
market 1 causes A to sell more there.

2. $\mathrm{sign}(dS_2^A/dZ) = \mathrm{sign}(\partial^2\pi^A/\partial S_1^A\partial S_2^A)$. We know from result 1 above
that, with a positive shock, in equilibrium A will sell more in market 1.
Whether this leads A to adopt a more aggressive or less aggressive
strategy in market 2 depends on whether the markets exhibit joint
economies (more aggressive) or joint diseconomies (less aggressive).

3. $\mathrm{sign}(dS_2^B/dZ) = \mathrm{sign}[(\partial^2\pi^A/\partial S_1^A\partial S_2^A) \cdot (\partial^2\pi^B/\partial S_2^A\partial S_2^B)]$. Whether
firm B's equilibrium strategy is more or less aggressive depends on two
things: (a) whether there are joint economies or diseconomies across
markets (by result 2 this determines whether A is more or less
aggressive in market 2), and (b) whether a more aggressive strategy by A in
market 2 (increased $S_2^A$) raises or lowers B's marginal profitability.
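
The three results can be illustrated with the numerical example of Section II. The sketch below assumes that the totally differentiated first-order conditions take the linear form $\pi \cdot (dS_1^A, dS_2^A, dS_2^B)^T = (-dZ, 0, 0)^T$, with $\pi$ the matrix of second partial derivatives; this form is our reading of the text, since the display itself does not survive in this copy. The helper `solve3` and the matrix entries are illustrative and are computed by hand from the example's linear demand and quadratic costs.

```python
from fractions import Fraction as Fr


def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(m, rhs):
    """Solve m * x = rhs by Cramer's rule; returns (det(m), x)."""
    d = det3(m)
    x = []
    for col in range(3):
        replaced = [[rhs[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        x.append(Fr(det3(replaced), d))
    return d, x


# Second derivatives in the Section II example (quantity competition):
# rows are the first-order conditions (1), (2), (3); columns are
# S1A, S2A, S2B. Costs contribute -1 to A's own and cross terms, and
# linear demand contributes -2 (own quantity) or -1 (rival quantity).
pi = [[-1, -1, 0],
      [-1, -3, -1],
      [0, -1, -3]]

# A one-unit increase in Z enters only the first condition.
det_pi, (dS1A, dS2A, dS2B) = solve3(pi, [-1, 0, 0])
print(det_pi, dS1A, dS2A, dS2B)
```

Under these assumptions a shock of five units scales the three derivatives to changes of 8, -3, and 1, which matches the move from (0, 50, 50) to (8, 47, 51) reported in Section II.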

Result 3 is the core of our paper.⁶

Think of $\partial^2\pi^B/\partial S_2^A\partial S_2^B$ as
$\partial/\partial S_2^A(\partial\pi^B/\partial S_2^B)$. That is, the term
represents the change in the marginal profitability to firm B of being a bit
more "aggressive" when firm A becomes more aggressive. In quantity
competition, for example, this equals the change in firm B's marginal
revenue when firm A increases its quantity.
$\partial^2\pi^B/\partial S_2^A\partial S_2^B$ can be of
either sign, both in differentiated or undifferentiated products
quantity competition and in differentiated products price competition. For
example, with undifferentiated products quantity competition and
constant elasticity demand, $\partial^2\pi^B/\partial S_2^A\partial S_2^B$ is negative for equilibria in
which $q^B$ is small relative to $q^A$ but positive for equilibria in which $q^B$
is sufficiently large relative to $q^A$. If $\partial^2\pi^B/\partial S_2^A\partial S_2^B$ is negative, we say
that B regards its product as a strategic substitute to A, and if
$\partial^2\pi^B/\partial S_2^A\partial S_2^B > 0$ we say that B regards the products as strategic
complements. Thus, with strategic substitutes B's optimal response to more
aggressive play by A is to be less aggressive (B decreases $S_2^B$). With
strategic complements B responds to more aggressive play with more
aggressive play (increases $S_2^B$).

⁶ Note that this result does not depend on the stability of market 2 in isolation: The
sign of the strategic effect (which is the sign of $dS_2^B/dZ$) is dependent only on the system
as a whole being stable. The assumption that market 2 is stable is, however, needed in
the analysis of sequential markets: Result 1 above is reversed if $\pi_{22}\pi_{33} < \pi_{32}\pi_{23}$.

Fig. 1. Reaction curves in an oligopoly market with strategic substitutes

With conventionally defined substitutes, $\partial\pi^B/\partial S_2^A < 0$: B earns less
total profits if A adopts a more "aggressive" strategy. Similarly, with
complements $\partial\pi^B/\partial S_2^A > 0$. With strategic substitutes and complements
we are concerned with the effect on marginal profitability. In the
numerical example in Section II the markets had joint diseconomies,
so an increase in profitability in market 1 implied an equilibrium
decrease in $S_2^A = q_2^A$. With linear demand in a Cournot model, the
decrease in $q_2^A$ caused an increase in firm B's marginal revenue curve so
$\partial^2\pi^B/\partial S_2^A\partial S_2^B < 0$ (strategic substitutes). The net result is that B's
equilibrium output ($S_2^B$) was increased by the profitability shock and A's
profits were hurt. The sign of the strategic effect on A's profits is
summarized below.



                         Joint Economies    Joint Diseconomies

Strategic substitutes           +                   -

Strategic complements           -                   +
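
The table follows from result 3 together with the assumption that the products are conventional substitutes, so that a more aggressive B lowers A's profits. A minimal sketch of that sign logic is given here; the function name and the boolean encoding are illustrative choices, not notation from the paper.

```python
def strategic_effect_sign(joint_economies, strategic_substitutes):
    """Sign of the strategic effect of a positive market 1 shock on A's profits.

    Assumes conventional substitutes, i.e. A's profits fall when B's
    strategic variable rises.
    """
    # Sign of the cross partial of A's profits across its two markets.
    cross_market = 1 if joint_economies else -1
    # Sign of the cross partial of B's profits in A's and B's strategies.
    rival_response = -1 if strategic_substitutes else 1
    # Result 3: B becomes more aggressive when the product is positive.
    b_more_aggressive = cross_market * rival_response
    # A more aggressive B hurts A, so the effect on A has the opposite sign.
    return "+" if -b_more_aggressive > 0 else "-"


for economies in (True, False):
    for substitutes in (True, False):
        print(economies, substitutes,
              strategic_effect_sign(economies, substitutes))
```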


We can also explain our numerical example, and illustrate the
analysis above, in terms of reaction curves. Figure 1 graphs the strategy of
firm A in market 2 as a function of B's strategy, $S_2^A(S_2^B)$, and the
strategy of B as a function of A's strategy, $S_2^B(S_2^A)$. The Nash
equilibrium is at point N. Since the numerical example used linear demand
and quantity competition, the products are strategic substitutes (when
A's equilibrium quantity is increased, B's marginal profitability curve
is shifted downward with linear demand and so B reduces quantity).
Therefore both curves are downward sloping: if $S_2^A$ is reduced B's
marginal profitability is increased and $S_2^B$ will be raised, and vice
versa.⁷

The demand shock in market 1, by increasing A's opportunity cost
of selling in market 2, shifts A's reaction curve inward to $S_2^{A\prime}(S_2^B)$.
(Firm A increases its output in market 1, thus increasing its marginal
costs in market 2 so that it will lower $S_2^A$ for any given $S_2^B$.) The new
Nash equilibrium is at N'. Firm A's strategic variable has decreased
marginally, while B's strategic variable has increased slightly. At
equilibrium $\partial\pi^A/\partial S_2^A = 0$, so that the slight change in $S_2^A$ has a negligible
effect on A's profits. But the small change in $S_2^B$ has a first-order
effect. In our numerical example, an increase in $S_2^B$ of one unit
reduces A's profits by one dollar times A's sales in market 2.

If firm A had joint economies across markets 1 and 2, a positive
demand shock in market 1 would push A's market 2 reaction curve
outward. In that case the new equilibrium would entail a slight
decrease in $S_2^B$, providing a strategic benefit to A.

If (from B's viewpoint) the products were strategic complements,
B's reaction curve would be upward sloping around N.⁸ The exact
opposite results would then occur. With joint diseconomies both firms
would be less aggressive so the strategic effect of the change in $S_2^B$ on
A's profits would be positive. With joint economies both firms would be more
aggressive and the change in $S_2^B$ would hurt firm A.⁹

Finally, note that the counterintuitive result of reduced total profits 
for A when market 1 is made more profitable is generally true if the 
strategic effect is negative and sales in market 1 are sufficiently small. 
The total effect of a profitability shock, AZ, on A’s profits is 


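$$(\Delta Z)\frac{d\pi^A}{dZ} = (\Delta Z)\left(\frac{\partial\pi^A}{\partial S_1^A}\frac{dS_1^A}{dZ} + \frac{\partial\pi^A}{\partial S_2^A}\frac{dS_2^A}{dZ} + \frac{\partial\pi^A}{\partial S_2^B}\frac{dS_2^B}{dZ} + \frac{\partial\pi^A}{\partial Z}\right). \tag{8}$$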


⁷ Stability in market 2 if it operated in isolation, or subsequently to market 1, requires
that on the axes chosen A's reaction curve $S_2^A(S_2^B)$ be steeper than B's.

⁸ The reader may check that it is not important whether A's reaction curve is upward
or downward sloping provided that the equilibrium is stable.

⁹ Figure 1 illustrates how our results extend to markets connected on the demand
side rather than (or as well as) on the cost side (see Sec. VI). The crucial questions are,
around equilibrium: (i) Does increasing A's activity in market 1 push A's market 2
reaction curve out or in? (ii) Does B's market 2 reaction curve slope down or up? The
sign of the strategic effect on A's profits can then be determined as below:

                                 A's reaction curve pushed:
B's reaction curve slopes:          Out          In

Down                                 +           -

Up                                   -           +

The first-order conditions assure that the first two terms in the
parentheses on the right-hand side of (8) equal zero. The last term
$\partial\pi^A/\partial Z = q_1^A$, which approaches zero if demand is sufficiently small.
Therefore, if $q_1^A$ is small enough,
$\mathrm{sign}(d\pi^A/dZ) = \mathrm{sign}[(\partial\pi^A/\partial S_2^B)(dS_2^B/dZ)]$. Of
course, $\partial\pi^A/\partial S_2^B < 0$ as long as the products are conventional
substitutes. Whenever $dS_2^B/dZ > 0$, as is true with joint diseconomies and
strategic substitutes or with joint economies and strategic
complements, the result in Section II will hold.
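
The decomposition in (8) can be made concrete with the Section II example. The sketch below assumes quantity competition, so that the surviving terms are the direct effect $q_1^A$ and the strategic effect $(\partial\pi^A/\partial S_2^B)(dS_2^B/dZ)$; the value $dS_2^B/dZ = 1/5$ and the closed-form equilibrium are derived by hand from the example's first-order conditions, and all function names are illustrative.

```python
from fractions import Fraction as Fr


def equilibrium(z):
    """Simultaneous equilibrium when the market 1 price is 50 + z (z >= 0)."""
    q2b = Fr(250 + z, 5)
    q2a = 200 - 3 * q2b
    q1a = 50 + z - q2a
    return q1a, q2a, q2b


def profit_a(z):
    """A's equilibrium profit, net of variable costs only."""
    q1a, q2a, q2b = equilibrium(z)
    p2 = 200 - q2a - q2b
    return (50 + z) * q1a + p2 * q2a - Fr(1, 2) * (q1a + q2a) ** 2


def decomposition(z):
    """Direct and strategic terms of equation (8) per unit of shock."""
    q1a, q2a, _ = equilibrium(z)
    direct = q1a                  # partial of A's profit with respect to Z
    strategic = -q2a * Fr(1, 5)   # (d profit_A / d q2b) * (d q2b / dZ)
    return direct, strategic


direct, strategic = decomposition(0)
h = Fr(1, 1000)
finite_difference = (profit_a(h) - profit_a(0)) / h
print(float(direct), float(strategic), float(finite_difference))
```

At the preshock equilibrium the direct term is zero because A sells nothing in market 1, so the whole marginal effect is the negative strategic term, which is why the shock lowers A's profits in this example.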


The Effect on Entry 

The strategic effect is the effect on the strategies of competitors who
are committed to competing in market 2. However, a profitability
shock to firm A in market 1 also affects the behavior of potential
entrants into market 2.¹⁰

The potential entrant B's decision whether to enter a market
depends on its total profits there (how aggressively it plays once it is in
depends on its marginal profit). The sign of the effect on B's total
profits (if it enters) depends on whether A plays more or less
aggressively in market 2 after the shock, that is, on the sign of $dS_2^A/dZ$.
Result 2 above therefore shows that the sign of the effect on entry
depends only on whether there are joint economies or diseconomies
across the markets, and not on whether products are strategic
substitutes or strategic complements.

The change in A's profits through the effect on entry into market 2
of a shock in market 1 is summarized below.

                         Joint Economies    Joint Diseconomies

Strategic substitutes           +                   -

Strategic complements           +                   -


IV. Sequential Markets 

The strategic effect is slightly different if A is able to precommit to its
output in market 1 before A and B compete in market 2. In general,
A will not set marginal revenue equal to marginal cost in market 1
when it considers the strategic effect.¹¹ The most obvious examples of
sequential markets are "learning curve" models, where there are joint
economies so that increased production by A in period 1 (market 1)
reduces its marginal costs in period 2 (market 2) and "natural
resources" models where increased production in period 1 raises A's
marginal costs in period 2.¹²

To consider the sequential case we need make only one
modification in our formal model. The first equilibrium condition,
instead of being $\partial\pi^A/\partial S_1^A = 0$, becomes $d\pi^A/dS_1^A = 0$. Firm A chooses
its market 1 strategy taking into account its total effect on profits,
including its influence on $S_2^B$. The implications of having sequential
markets can be readily seen just from examining that first-order
condition:

$$\frac{d\pi^A}{dS_1^A} = \frac{\partial\pi^A}{\partial S_1^A} + \frac{\partial\pi^A}{\partial S_2^A}\frac{dS_2^A}{dS_1^A} + \frac{\partial\pi^A}{\partial S_2^B}\frac{dS_2^B}{dS_1^A} + \frac{\partial\pi^A}{\partial Z}\frac{dZ}{dS_1^A} = 0. \tag{9}$$

We know from the second first-order condition, equation (2), that
$\partial\pi^A/\partial S_2^A = 0$. Also $dZ/dS_1^A = 0$, so the total effect on profits of an
increase in $S_1^A$ is marginal revenue less marginal cost ($\partial\pi^A/\partial S_1^A$) plus a
strategic term. The value of the strategic term
$(\partial\pi^A/\partial S_2^B)(dS_2^B/dS_1^A)$ can
be found by differentiating and solving the first-order equations (2)
and (3) simultaneously, as before. $\partial\pi^A/\partial S_2^B$ is negative provided that
products are (conventionally defined) substitutes so that the sign of
the strategic term $= -\mathrm{sign}(dS_2^B/dS_1^A) = -\mathrm{sign}[(\partial^2\pi^A/\partial S_1^A\partial S_2^A) \cdot (\partial^2\pi^B/\partial S_2^A\partial S_2^B)]$.

With joint economies, as in the learning curve case, if there are
strategic substitutes then the strategic term is positive so $\partial\pi^A/\partial S_1^A$ will
be negative; that is, A will choose $S_1^A$ at a higher level than the point
where marginal revenue equals the marginal cost of increasing $S_1^A$.
An obvious implication is that A may produce in market 1 even if
total costs exceed total revenues in that market. The same results
would hold with joint diseconomies and strategic complements.

If we had a negative strategic effect (joint diseconomies and
strategic substitutes or joint economies and strategic complements)
the exact opposite results would hold. A firm may stay out of a market
even though there are no fixed costs and marginal revenue exceeds
marginal cost for the first few units. In the numerical example in
Section II, firm A would have stayed out of market 1 if that market
cleared prior to market 2.

¹⁰ We model this formally as follows: (i) A is precommitted to competing in markets 1
and 2, (ii) there is a profitability shock in market 1, (iii) B decides on the basis of the
shock whether or not to enter market 2, and (iv) the markets clear simultaneously.

¹¹ We are defining marginal revenue and marginal costs here in the usual way:
$MR_j^F = \partial TR_j^F/\partial q_j^F$ and $MC_j^F = \partial TC^F/\partial q_j^F$, where MR, TR, MC, and TC represent marginal and
total revenues and costs for firm F, and the subscript indexes markets. Thus the
marginal cost in market 1 is the long-run marginal cost and correctly anticipates market 2
output. We are considering a "perfect" Nash equilibrium of the game in which A
precommits to $S_1^A$ and then A and B simultaneously choose $S_2^A$ and $S_2^B$. The equilibrium
$(\bar S_1^A, \bar S_2^A, \bar S_2^B)$ is "perfect" in that $(\bar S_2^A, \bar S_2^B)$ is a Nash equilibrium of the single market game
played in market 2, holding $S_1^A = \bar S_1^A$. We suppose that, for the matrix $\pi$, evaluated at
$(\bar S_1^A, \bar S_2^A, \bar S_2^B)$ instead of $(S_1^A, S_2^A, S_2^B)$, it is still the case that $\pi_{22}\pi_{33} > \pi_{23}\pi_{32}$. It follows that
for $S_1^A$ near $\bar S_1^A$ we can solve uniquely for $S_2^A$ and $S_2^B$ near $(\bar S_2^A, \bar S_2^B)$. By the implicit
function theorem we know that the functions $S_2^A(S_1^A)$ and $S_2^B(S_1^A)$ are differentiable
when $S_1^A = \bar S_1^A$.

¹² More usually with sequential markets, both firms will be in both markets (i.e., in
both periods). We consider the case in which only A is in market 1 in order to simplify
the analysis. The qualitative effect that the existence of market 2 has on A's market 1
strategy is unaffected by the presence of B in market 1 (see Bulow et al. 1983).
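
The claim that firm A would have stayed out of market 1 under sequential play can be checked directly. The sketch below assumes the Section II demand and cost functions, lets A precommit to a market 1 quantity, solves the resulting market 2 Cournot game, and then searches a grid of precommitments; the grid bounds and function names are illustrative assumptions.

```python
from fractions import Fraction as Fr


def market2_equilibrium(q1a):
    """Cournot outputs in market 2 given A's precommitted market 1 output."""
    # A: 200 - 2*q2a - q2b = q1a + q2a ;  B: 200 - q2a - 2*q2b = q2b
    q2b = Fr(400 + q1a, 8)
    q2a = 200 - 3 * q2b
    return q2a, q2b


def profit_a(q1a, p1):
    """A's total profit (net of variable costs) from precommitting to q1a."""
    q2a, q2b = market2_equilibrium(q1a)
    p2 = 200 - q2a - q2b
    return p1 * q1a + p2 * q2a - Fr(1, 2) * (q1a + q2a) ** 2


def best_precommitment(p1, step=Fr(1, 100), q_max=20):
    """Grid search for A's profit-maximizing market 1 output on [0, q_max]."""
    candidates = [k * step for k in range(int(q_max / step) + 1)]
    return max(candidates, key=lambda q: profit_a(q, p1))


q_star_55 = best_precommitment(55)
q_star_60 = best_precommitment(60)
print(float(q_star_55), float(profit_a(q_star_55, 55)))
print(float(q_star_60), float(profit_a(q_star_60, 60)))
```

With a market 1 price of 55, A's best precommitment on this grid is zero, and its profit stays at the preshock level instead of falling as it does under simultaneous play. A higher price such as 60 is included only to show that the search can return a positive quantity.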

Note that with sequential markets (but not simultaneous markets) a 
firm cannot be hurt by the existence of a profitable market that clears 
first. Because A precommits to a level of $S_1^A$, a small positive shock,
$\Delta Z$, will raise profits by exactly $q_1^A \Delta Z$. In sequential markets A may
take an apparently unprofitable opportunity because of its strategic
implications. This cannot happen in the simultaneous markets
equilibrium because B anticipates that $\partial\pi^A/\partial S_1^A = 0$ and A cannot gain by
doing otherwise.

Our analysis can readily be applied to any decision A might make at
one time that would affect its marginal profitability at a later time. We
can reinterpret $S_1^A$ as any strategic variable that affects future
marginal profitability. For example, $S_1^A$ might be investment in sunk costs
in period 1 that reduces marginal costs in period 2. Then with
strategic substitutes the strategic effect of investment is to make B
play less aggressively in period 2 so that A will overinvest in fixed
costs. With strategic complements, however, A will underinvest in fixed
costs. This contrasts with the qualitative implications of papers that
focus exclusively on the use of "excess capacity" to deter entry. We
continue this discussion in Section VIA.

Note again the distinction between the effect of A's actions on a
potential competitor's entry decision and the effect on the
aggressiveness of the competitor contingent on its entry. A greater
investment by A in period 1 will cause it to produce more in market 2 and
therefore lowers B's total profits, so that reductions in A's marginal
costs always have the strategic benefit of making entry less profitable
for B. However, if B regards the products as strategic complements
then, contingent on deciding to enter, B will compete more
aggressively the more A has invested.

V. Quantity versus Price Competition 

Thus far we have modeled competition as the choice of an abstract 
strategic variable S. In this section we specialize our analysis to two 
familiar models of oligopolistic competition: homogeneous products 
quantity competition and differentiated products price competition. 

A. Quantity Competition 

In quantity competition firm i chooses $s_i = q_i$, the number of units to
be sold in the market. If there are n firms, then (with undifferentiated
products) the market price is a function of industry quantity
$f(\sum_{k=1}^{n} q_k)$, and firm i's costs are $c_i(q_i)$. The condition for
strategic complements is

$$\frac{\partial^2\pi_i}{\partial q_i \partial q_j} = f'\Big(\sum_{k=1}^{n} q_k\Big) + q_i f''\Big(\sum_{k=1}^{n} q_k\Big) > 0, \quad i \neq j. \tag{10}$$

We note that total revenues for the firm are $q_i f(\sum_{k=1}^{n} q_k)$. Thus the
slope of firm i's marginal revenue curve is

$$\frac{\partial^2 [q_i f(\sum_{k=1}^{n} q_k)]}{\partial q_i^2} = 2 f'\Big(\sum_{k=1}^{n} q_k\Big) + q_i f''\Big(\sum_{k=1}^{n} q_k\Big). \tag{11}$$

The difference between (10) and (11) is simply $f'(\sum_{k=1}^{n} q_k)$, the slope
of the demand curve. Thus (10) can be rewritten as

(slope of marginal revenue curve) > (slope of demand curve)    (10')

or that the demand curve is steeper than the marginal revenue curve.
Of course with linear demand and quantity competition the firm
always regards its marginal revenue curve as twice as steep as its
demand curve and therefore regards the products as strategic
substitutes.

If industry marginal revenue is decreasing in total output, then
(with undifferentiated products) only a firm producing more than
half the total market output can regard competitors' outputs as
strategic complements.¹³ Thus, a large firm in an industry may regard
products as strategic complements while its competitive fringe
regards them as strategic substitutes (the reverse result is impossible).
With a constant elasticity demand curve and a small enough fringe
the dominant firm will always regard the products as strategic
complements because its marginal revenue curve will be flatter than its
demand curve. Consider the constant elasticity inverse demand
$p = (\sum_{k=1}^{n} q_k)^{-\alpha}$, $0 < \alpha < 1$, at an equilibrium $q_1, q_2, \ldots, q_n$ in which
$(\sum_{k=2}^{n} q_k)/q_1 < \alpha$; that is, firm 1 is the dominant firm and the other firms
represent a sufficiently small fringe. The reader can calculate that firm 1
regards products as strategic complements ($\partial^2\pi_1/\partial q_1\partial q_j > 0$, $j \neq 1$) while
all other firms regard the products as strategic substitutes.
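
The calculation left to the reader can be sketched numerically. The code below evaluates condition (10) for the inverse demand $p = Q^{-\alpha}$, where Q denotes total output; the quantities used (a dominant firm producing 9 and three fringe firms producing 1 each) are hypothetical and are not claimed to be an equilibrium of any particular cost structure.

```python
def cross_partial(q_own, q_others, alpha):
    """Left-hand side of condition (10) for inverse demand p = Q**(-alpha).

    q_own is the firm's own output and q_others is the combined output
    of all its competitors.
    """
    total = q_own + q_others
    f_prime = -alpha * total ** (-alpha - 1)
    f_double_prime = alpha * (alpha + 1) * total ** (-alpha - 2)
    return f_prime + q_own * f_double_prime


alpha = 0.5
dominant_output = 9.0
fringe_outputs = [1.0, 1.0, 1.0]
total_output = dominant_output + sum(fringe_outputs)

# The dominant firm faces fringe output of 3, and 3 / 9 < alpha.
dominant_sign = cross_partial(dominant_output, sum(fringe_outputs), alpha)
# Each fringe firm faces rival output of 11, far more than its own.
fringe_sign = cross_partial(1.0, total_output - 1.0, alpha)
print(dominant_sign > 0, fringe_sign < 0)
```

Algebraically the expression is positive exactly when a firm's rivals produce less than $\alpha$ times its own output, which is the inequality stated in the text for firm 1 and can hold only for a firm with more than half the market.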

This dominant firm result has two interesting applications. First, it
provides a setting in which the dominant firm expands in response to
a fringe incursion simply because the dominant firm is a Cournot-Nash
player; we do not need to rely on asymmetric information or the
desire of the dominant firm to establish a "reputation" in repeated
play. Second, the dominant firm can credibly build capacity that it will
use if it is faced with competition even though the capacity will sit idle
if the competition does not arise. The firm may thus hold excess
capacity to deter entry, in contrast to Dixit (1980) and Spulber (1981);
see Section VID.

¹³ With undifferentiated products quantity competition, a firm will either regard all
its competitors' products as strategic substitutes or regard all its competitors' products
as strategic complements, because the firm's marginal revenue curve is only a function
of competitors' combined output.


B. Price Competition 

We now consider the case where the strategic variables are prices,¹⁴
and first suppose there are constant marginal costs of production. If
demand for B's product is downward sloping given a fixed price $p^A$
for A's product, then at a profit-maximizing price B is setting
$p^B[1 + (1/\eta)] = MR = MC$, where
$\eta = \{[\partial q^B(p^A, p^B)]/\partial p^B\}(p^B/q^B)$ is B's elasticity
of demand with respect to B's own price, given $p^A$. When A raises
price (thus raising B's quantity), B will adjust so that this relation
again holds. With constant marginal cost it is clear that whether B
regards the products as strategic substitutes or complements depends
strictly on elasticity: if demand becomes more inelastic when $p^A$ is
raised, then B will respond by raising $p^B$, and we have strategic
complements. If B's demand becomes more elastic at the original $p^B$ when
A raises price, then B will respond by cutting $p^B$ and we have strategic
substitutes.

With increasing or decreasing marginal costs both sides of the MR
= MC equation are affected by A's price increase. With increasing
marginal costs, even if elasticity is held constant A's price rise will
cause B to raise price. With decreasing marginal costs and constant
elasticity, B will cut price when A charges more (strategic substitutes).
With demand of the form $q^B = f(p^A) + g(p^B)$, $f' > 0$, $g' < 0$, whether
B regards the products as strategic substitutes or complements
depends on whether its demand curve is steeper (strategic
complements) or less steep (strategic substitutes) than its marginal cost curve.
This condition determines whether the increase in B's quantity
caused by a price increase by A decreases B's marginal revenue by
more (strategic complements) or less (strategic substitutes) than it has
changed marginal cost. With linear demand and increasing marginal
cost, for example, B will always regard the goods as strategic
complements.
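
The steepness comparison can be illustrated with a linear special case. The sketch below assumes demand $q^B = a - b p^B + d p^A$ and cost $c q + (k/2) q^2$ for firm B; these functional forms and all parameter values are assumptions chosen for illustration, and the second-order condition requires $2 + bk > 0$.

```python
from fractions import Fraction as Fr


def best_response_price(p_a, a=100, b=1, d=Fr(1, 2), c=10, k=1):
    """B's profit-maximizing price given A's price p_a.

    Demand is q = a - b*p_b + d*p_a and cost is c*q + (k/2)*q**2.
    The first-order condition q*(1 + b*k) = b*(p_b - c) is solved for p_b.
    """
    return Fr((a + d * p_a) * (1 + b * k) + b * c, b * (2 + b * k))


def reaction_slope(b=1, d=Fr(1, 2), k=1):
    """Slope of B's price reaction curve, d p_b / d p_a."""
    return Fr(d * (1 + b * k), b * (2 + b * k))


# Increasing marginal cost (k > 0): B raises price when A does.
rising_mc_slope = reaction_slope(k=1)

# Decreasing marginal cost that is steeper than the demand curve
# (|k| > 1/b): B cuts price when A raises price.
falling_mc_slope = reaction_slope(k=Fr(-3, 2))

print(rising_mc_slope, falling_mc_slope)
print(best_response_price(10), best_response_price(10, c=100, k=Fr(-3, 2)))
```

In terms of the inverse strategic variable $S = 1/p$, a positive price slope means B becomes less aggressive when A becomes less aggressive, which is the strategic complements case described above.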


¹⁴ By price competition we mean a game in which firms set prices and then must sell
all that is demanded at that price. We describe below how price competition can be
modeled within the framework of Sec. III: For firm A, let
$\tilde\pi^A(q_1^A, p_2^A, p_2^B) = R_1^A(q_1^A) + p_2^A f^A(p_2^A, p_2^B) - C^A(q_1^A, f^A(p_2^A, p_2^B))$,
where $f^A$ is the quantity A sells in market 2 when A
and B charge prices $p_2^A$ and $p_2^B$, respectively, and where $C^A$ is the technologically given
cost function depending on the quantities $q_1^A$ and $q_2^A = f^A(p_2^A, p_2^B)$. We can write
$\pi^A(S_1^A, S_2^A, S_2^B) = \tilde\pi^A(S_1^A, 1/S_2^A, 1/S_2^B)$ and then employ our more general analysis.

VI. Applications

A. Strategic Underinvestment in Fixed Costs 

As we noted in Section IV, selling units in one market in order to 
reduce marginal costs in a second market is formally equivalent to 
investing in capital that will directly lower the marginal production 
costs in the second market. With decreasing marginal costs a firm may 
sell units at a loss in the first market in order to prevent entry in the 
second, if the markets are sequential. This is equivalent to the familiar
result that a firm may overinvest in capital in the first period of a
2-period game (that is, invest beyond the point where an extra dollar's
investment in period 1 saves a dollar's expenses in period 2) in order
to reduce its marginal cost and hence deter entry in the second
period.¹⁵

If, however, the firm cannot prevent entry in the second period,
then if we have decreasing costs and strategic complements the firm
may underinvest in period 1 (stop investing at a point where an extra
dollar's investment would save more than a dollar's expenses in
period 2) because with strategic complements the strategic effect of
reducing marginal costs is to make opponents compete more
aggressively in period 2.

Suppose that firm A produces output from a neoclassical,
constant-returns-to-scale production function $q = f(K, L)$. Assume that A can
install capital in period 1 at a price r and that in period 2 firm A can
hire labor at price w and immediately produce output. In period 2
firm A and another firm B, with known marginal cost, compete by
announcing quantities or prices $S_2^A$ and $S_2^B$ for the produced goods. If
B regards the products as strategic substitutes
($\partial^2\pi^B/\partial S_2^A\partial S_2^B < 0$), then
A's equilibrium choice of K and L satisfies K/L > efficient K/L, so
there is "overinvestment," but if B regards the products as strategic
complements, then K/L < efficient K/L, and there is
underinvestment.

Thus, for example, with price competition and linear demand, the
more a firm invests, the lower an entrant's price will be, because
greater investment lowers the incumbent's expected price. A firm
may have an "entry deterrence" incentive to overinvest, but if it
cannot deter entry then it has a "price war avoidance" incentive to hold
back and underinvest. Similarly, if transportation firms compete on
the quality of their facilities, then "raising the stakes" by investing in
expensive modern equipment may make entry less likely, but if entry
does occur (and the firms' products are strategic complements), the
entrant may buy more modern equipment as well. A firm that cannot
deter entry may underinvest.

¹⁵ It may be possible for the firm to increase investment and lower total variable cost
but still raise marginal cost in the relevant range. This would reverse all the results in
this section; e.g., such investment would make potential entrants assume that the firm
would compete less aggressively and would encourage potential entry, so that the firm
would have an incentive to underinvest to deter entry.

These results contrast with the work of Spence (1977), Dixit (1979, 
1980), and others, who focus exclusively on the use of excess capacity 
to deter entry. 

B. Royalties and License Fees 

Kamien and Tauman (1983) have studied the problem of an inventor 
who is selling the rights to a technology that would shift downward 
the marginal cost curve of the licensees in an oligopolistic industry. 
Would the inventor earn more money by charging each firm a royalty 
per unit produced or a fixed fee per firm? 

With strategic substitutes firms will pay more than the direct savings
to obtain a lower marginal cost, because their lower costs will cause
competitors to compete less aggressively. The inventor is better off
charging a flat fee because his customers will pay a premium for low
marginal costs. With strategic complements, however, lower marginal
costs for a firm cause its competitors to adopt more aggressive
strategies so that licensees will bid less than their direct saving for the use of
the innovation. In this case the inventor can eliminate the harmful
strategic effect by charging a royalty fee equal to the per unit savings
in production, so that firms' marginal costs are unaffected by the
innovation.

C. Dumping in International Trade 

The broadest definition of dumping includes any case where a firm 
price discriminates between two markets. A narrower definition, and 
the one that concerns us, covers situations in which a firm sells in a 
market to a point where marginal revenue is less than marginal cost. 
The strategic effect provides two explanations of dumping. 

First, if a firm is selling only in a foreign market, and if the strategic 
variables are strategic substitutes, then a subsidy to the firm will in¬ 
crease its profits by more than the subsidy. Thus a government may 
subsidize a firm to “dump” its products at low prices in a foreign 
market. 16 Similarly, with joint diseconomies and strategic substitutes 
there are strategic reasons to impose a tax (such as a rebatable VAT) 


16 This point has also been made independently by Eaton and Grossman (1983), 
Brander and Spencer (1984), Dixit (1984), and Krugman (1984). 
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on domestic sales of a domestic monopolist but not on exports, rather 
than charge a lower tax on all production. These strategic benefits 
may exceed the welfare loss resulting from reduced domestic sales. 17 

Second, in sequential markets firms may produce unprofitably in 
one period to gain the strategic benefit of making competitors less 
aggressive in future periods. For example, if a Japanese firm has 
decreasing costs over time (as in an industry with a “learning curve”) 
and its American competitor’s product is a strategic substitute, the 
firm may “dump” output in the early stages of a market’s develop¬ 
ment to encourage the competitor to either contract operations or 
withdraw from the market. 


D. Holding Idle Capacity to Deter Entry 

Building extra capacity converts marginal costs into fixed costs and so 
raises a firm’s output. Therefore, such actions may deter entry. How¬ 
ever, beyond a certain point additional capacity will not deter entry 
further. Capacity deters competitors only if they believe the capacity 
will be used after entry. Thus the most extra capacity a firm will build 
is the amount it will actually use if entry occurs. Dixit (1980) and 
Spulber (1981) (effectively assuming strategic substitutes) concluded 
that if it would be profitable to use all capacity after competition 
enters, then it surely would be profitable to use all capacity if no entry 
occurs. Hence no capacity would be built and subsequently left idle. 

With strategic complements and quantity competition, however, a 
firm will want to supply less if it remains a monopolist than if competi¬ 
tors produce. Consequently, capacity can be built that the firm could 
credibly threaten to use in the event of entry but that would be left 
idle if entry was deterred. Firms anticipating price competition with 
strategic complements may also rationally install idle capacity to deter 
entry. We give further details in Bulow, Geanakoplos, and Klemperer 
(1985). 

E. Is a Little Bit of Competition a Good Thing? 

In this section we restrict ourselves to quantity competition in homo¬ 
geneous products. Consider an entrant into a monopoly market, 
where the entrant has constant marginal costs just below the pre-entry 


17 The policy, tax or subsidy, on the (home sales, exports) of a domestic monopolist 
that maximizes the value of the strategic effect is as follows: 

                          Joint Economies       Joint Diseconomies 
Strategic substitutes     (Subsidy, subsidy)    (Tax, subsidy) 
Strategic complements     (Tax, tax)            (Subsidy, tax) 
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monopoly price. Does this “little bit of competition” increase or de¬ 
crease welfare? 

Entry is a good thing with strategic complements and is welfare 
reducing with strategic substitutes. Because the entrant produces a 
small amount at a cost almost equal to the social value of its output, 
the entrant has only a second-order direct effect on welfare. How¬ 
ever, the entry will cause changes in the incumbent’s output, and 
those changes do have a first-order effect on surplus, equal to the 
change in the incumbent’s output times the price minus its marginal 
cost. With strategic complements the incumbent responds to entry by 
increasing output (and thus welfare); with strategic substitutes the 
incumbent’s output, and thus welfare, is decreased. 
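
The first-order argument above can be checked numerically. The following is a minimal sketch, assuming linear inverse demand p = a - Q (so that quantities are strategic substitutes), a zero-cost incumbent, no fixed costs, and an entrant whose marginal cost is one unit below the monopoly price. All parameter values and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration: linear inverse demand p = a - Q, so quantities
# are strategic substitutes. All numbers are assumptions, not from the text.

def cournot_outputs(a, c_incumbent, c_entrant):
    """Cournot equilibrium outputs for two firms with constant marginal costs."""
    q_incumbent = (a - 2 * c_incumbent + c_entrant) / 3
    q_entrant = (a - 2 * c_entrant + c_incumbent) / 3
    return q_incumbent, q_entrant


def total_surplus(a, firms):
    """Area under the demand curve up to total output, minus production costs.

    `firms` is a list of (output, marginal_cost) pairs.
    """
    total_q = sum(q for q, _ in firms)
    gross_benefit = a * total_q - total_q ** 2 / 2
    production_cost = sum(q * c for q, c in firms)
    return gross_benefit - production_cost


a = 100.0                      # demand intercept (assumed)
c_incumbent = 0.0              # incumbent's marginal cost (assumed)
q_monopoly = (a - c_incumbent) / 2
p_monopoly = a - q_monopoly
c_entrant = p_monopoly - 1.0   # entrant's cost just below the monopoly price

q_incumbent, q_entrant = cournot_outputs(a, c_incumbent, c_entrant)
welfare_before = total_surplus(a, [(q_monopoly, c_incumbent)])
welfare_after = total_surplus(a, [(q_incumbent, c_incumbent), (q_entrant, c_entrant)])

# First-order term from the text: change in incumbent output times (price - MC).
first_order_effect = (q_incumbent - q_monopoly) * (p_monopoly - c_incumbent)

print(f"incumbent output: {q_monopoly:.2f} -> {q_incumbent:.2f}")
print(f"welfare change: {welfare_after - welfare_before:.2f}")
print(f"first-order approximation: {first_order_effect:.2f}")
```

Under these assumptions the incumbent contracts, welfare falls, and the exact welfare change is close to the first-order term; the remaining gap is the second-order contribution of the entrant's own small output.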

F. Rational Retaliation as a Barrier to Entry 

A special case of entry deterrence occurs when two monopolists are 
potential entrants into each other’s markets. It may be rational for a 
firm A to retaliate against entry into its market by a second firm B but 
not enter against B otherwise. 

There are two reasons. First, if B faces diseconomies of scope its 
expansion into a second market makes it generally more vulnerable to 
entry. Second, B’s entry will change A’s equilibrium output in the 
market where it is the incumbent and therefore possibly alter its deci¬ 
sion of whether to enter B’s market. For example, if there are joint 
diseconomies across markets, then an attack that reduced A’s home 
market output would also raise the marginal profitability of its pro¬ 
ducing in B’s market. 

The story that one firm might do best to avoid another’s “territory” 
for fear of retaliation is not unfamiliar. The point we are making here 
is that neither does the equilibrium in which each company avoids the 
other’s territory depend on tacit collusion, nor is the threat of retalia¬ 
tion one that would be costly to carry out. Initially, it might be costly 
for A to enter B’s market even if it were not worried about retaliation 
itself. Thus no tacit collusion is necessary to restrain it from expand¬ 
ing. However, if B enters A’s market then it may be profitable for A to 
retaliate. So the threat that deters B’s expansion is a credible one. 18 


18 The precise game we are describing has three stages. First B announces whether it 
will enter A’s market, then A announces whether it will enter B's market, then the 
simultaneous market game is played. We consider only perfect equilibria. The first 
reason for retaliation, that B's higher marginal costs make its market more attractive to 
potential entrants, was discussed in the latter part of Sec. III. The second reason, that 
even if B's marginal costs are unaffected A finds entering B's market more attractive, is 
illustrated by the following numerical example: 

A is initially a monopolist in market 1; B is the incumbent monopolist in market 2. Each 
firm faces a fixed cost of 750 of competing in each market. The inverse demand in 
G. Limit Overpricing 

The limit-pricing literature suggests that a monopolist may price 
lower than the pre-entry profit-maximizing price in order to signal 
low costs and deter entry (see, e.g., Milgrom and Roberts 1982). How¬ 
ever, an important assumption in this literature is that any entrant 
learns the incumbent’s costs immediately after entry. If this assump¬ 
tion is relaxed, then, if entry occurs and with strategic complements, 
the incumbent would like the entrant to believe that its costs are 
higher than they really are. Thus the incumbent might, in principle, 
charge a higher first-period price than the monopoly price. Although 
it has an “entry deterrence” incentive to price low and signal low costs, 
it has a “price war avoidance” incentive to price high and signal high 
costs, which helps it if entry does occur. Again the basic point is that 
with strategic complements a firm’s competitors play less aggressively 
if the firm’s costs are perceived to be higher. 

H. The Learning Curve 

The learning curve literature discusses the problem of firms that 
compete in sequential markets and experience joint economies. An 
important issue is whether the sequential (or closed loop) equilibrium 
in such a game is much different from the simultaneous (open loop) 
equilibrium in which firms ignore the impact of their first-period 
decisions on competitors’ second-period strategies. 

In his seminal paper, Spence (1981) compared the simultaneous 
and sequential equilibria in a two-period problem with two firms producing 
in each period in which industry demand had a constant elasticity 
of −1.25 and there was no spillover or diffusion of knowledge 
between firms. He concluded that, while firms’ first-period outputs 
would be greater in the sequential case, the differences from the 


market i is $p_i = 100 - q_i^A - q_i^B$. Firm B can supply all its markets as much as it wants at a 
constant marginal cost of zero. Firm A has the capacity to produce 60 units at a constant 
marginal cost of zero but has a very high marginal cost (say 100) of supplying additional 
units beyond 60. If A and B stay in their respective markets, each will produce 50 units 
and sell at a price of 50, earning net profits of 50 × 50 − 750 = 1,750 each. A initially 
has no incentive to invade B’s market. If it did, B would produce 40 in the contested 
market. A would produce 20 in the contested market and 40 in the uncontested market, 
earning 1,700. If, however, B enters A's market, A has the choice of remaining in 
its home market only and selling 33 1/3 for a net profit of 361 1/9, or retaliating by 
entering B's market. If both firms are in both markets, the equilibrium is that B 
produces 35 in each market (at a price of 35) and A produces 30 in each. B’s profits are 
950 and A's profits are 600. Thus A would react to B's entry by entering B's market but 
would not have entered otherwise. B, which apparently had an incentive to invade A's 
market (its profits would have risen from 1,750 to 2,111 1/9 if A had not retaliated), 
would have done better to confine itself to one market. 
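
The profit figures in this numerical example can be reproduced with a short computation. The sketch below is an illustration only: it assumes that firm A never produces beyond its 60-unit capacity (its marginal cost there exceeds any attainable price) and finds each Cournot equilibrium by iterating best responses. The function and variable names are hypothetical.

```python
INTERCEPT = 100.0    # inverse demand in each market: p = 100 - qA - qB
FIXED_COST = 750.0   # per firm, per market entered
CAPACITY_A = 60.0    # A's zero-cost capacity across both markets


def best_response_b(q_a, b_active):
    """B has zero marginal cost and no capacity limit."""
    return [(INTERCEPT - qa) / 2 if active else 0.0
            for qa, active in zip(q_a, b_active)]


def best_response_a(q_b, a_active):
    """A has zero marginal cost up to CAPACITY_A and never produces beyond it."""
    q = [(INTERCEPT - qb) / 2 if active else 0.0
         for qb, active in zip(q_b, a_active)]
    if sum(q) > CAPACITY_A:
        # Capacity binds: equalize marginal revenue (lam) across active markets.
        residual = [INTERCEPT - qb for qb, active in zip(q_b, a_active) if active]
        lam = (sum(residual) - 2 * CAPACITY_A) / len(residual)
        q = [(INTERCEPT - qb - lam) / 2 if active else 0.0
             for qb, active in zip(q_b, a_active)]
    return q


def equilibrium_profits(a_active, b_active, rounds=200):
    """Iterate best responses to the Cournot equilibrium; return (profit_A, profit_B)."""
    q_a, q_b = [0.0, 0.0], [0.0, 0.0]
    for _ in range(rounds):
        q_a = best_response_a(q_b, a_active)
        q_b = best_response_b(q_a, b_active)
    prices = [INTERCEPT - qa - qb for qa, qb in zip(q_a, q_b)]
    profit_a = sum(p * q for p, q in zip(prices, q_a)) - FIXED_COST * sum(a_active)
    profit_b = sum(p * q for p, q in zip(prices, q_b)) - FIXED_COST * sum(b_active)
    return profit_a, profit_b


# Tuples are (active in market 1, active in market 2).
status_quo = equilibrium_profits((True, False), (False, True))
a_invades = equilibrium_profits((True, True), (False, True))
b_invades = equilibrium_profits((True, False), (True, True))
both_invade = equilibrium_profits((True, True), (True, True))

for label, (pa, pb) in [("status quo", status_quo), ("A invades", a_invades),
                        ("B invades", b_invades), ("both invade", both_invade)]:
    print(f"{label:12s} profit A = {pa:8.2f}   profit B = {pb:8.2f}")
```

Comparing the four outcomes shows the logic of the example: A prefers the status quo to invading unprovoked, prefers retaliation to staying home once B has entered, and B anticipating retaliation does better by staying out.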
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simultaneous case (which is much easier to solve in many-period mod¬ 
els) were very small. Fudenberg and Tirole (1983), analyzing linear 
demand, achieved the same qualitative result as Spence but argued 
that the differences in the equilibria between the simultaneous and 
sequential analyses were important. By discussing the interrelation¬ 
ship of markets through strategic substitutes and complements, we 
can clarify this issue. 

Spence’s small quantitative differences in equilibria were an artifact 
of his choosing a constant elasticity demand curve with elasticity near 
−1. The reader can confirm that with constant elasticity of −1 (only 
necessary for the second period) and symmetrical duopoly quantity 
competition, $\partial^2 \pi^A / \partial S_2^A \partial S_2^B = \partial^2 \pi^B / \partial S_2^A \partial S_2^B = 0$, so the strategic term is 
neutral. In this case there will be no difference between the simul¬ 
taneous and sequential solutions. 
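
The sign of the cross-partial can be confirmed numerically. The sketch below is a hedged illustration, not the authors' calculation: it uses a central finite difference, writes the second-period strategic variables as quantities, and assumes hypothetical demand and cost parameters (k, a, c).

```python
def cross_partial(profit, q_own, q_rival, h=1e-3):
    """Central finite-difference estimate of d^2(profit) / d(q_own) d(q_rival)."""
    return (profit(q_own + h, q_rival + h) - profit(q_own + h, q_rival - h)
            - profit(q_own - h, q_rival + h) + profit(q_own - h, q_rival - h)) / (4 * h * h)


def profit_unit_elastic(q_own, q_rival, k=100.0, c=1.0):
    """Demand with constant elasticity -1: price = k / total output."""
    return k * q_own / (q_own + q_rival) - c * q_own


def profit_linear(q_own, q_rival, a=100.0, c=1.0):
    """Linear inverse demand: price = a - total output."""
    return (a - q_own - q_rival) * q_own - c * q_own


symmetric_unit = cross_partial(profit_unit_elastic, 10.0, 10.0)
symmetric_linear = cross_partial(profit_linear, 10.0, 10.0)
larger_firm_unit = cross_partial(profit_unit_elastic, 15.0, 5.0)

print(f"unit elasticity, symmetric outputs: {symmetric_unit:.6f}")
print(f"linear demand, symmetric outputs:   {symmetric_linear:.6f}")
print(f"unit elasticity, larger firm:       {larger_firm_unit:.6f}")
```

With unit elasticity the cross-partial vanishes at symmetric outputs, so the strategic term is neutral; with linear demand it is negative (strategic substitutes); and with unit elasticity it turns positive for a firm larger than its rival.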

With linear demand and quantity competition, products are 
strategic substitutes and first-period output is higher in the sequential 
game. However, there is no reason to assume that a real market with 
quantity competition would exhibit strategic substitutability. Strategic 
complementarity gives all decreasing cost firms the incentive to pro¬ 
duce less in the sequential equilibrium, reversing the results of Spence 
(1981) and Fudenberg and Tirole (1983). 

With two firms in two markets and price competition, the situation 
is more complicated. 19 Firm A’s price in the first period not only 
affects its own costs in the second period but, by affecting B’s first- 
period sales, affects B’s second-period costs. (In quantity competition 
A’s choice of quantity does not affect B’s first-period quantity and so 
cannot affect B’s second-period costs.) 20 In calculating the strategic 
effect of its action in the first period, A must consider its impact on B’s 
second-period reaction curve as well as on A’s. Strategic substitutes 
price competition will always give lower first-period prices in the se¬ 
quential equilibrium than in the simultaneous equilibrium. Strategic 
complements price competition can lead to either higher or lower 
first-period prices in a sequential equilibrium than in a simultaneous 
equilibrium. However, with linear demand and symmetrical firms, a 
lower price for A in period 1 will always imply a higher price for B in 
period 2 (because of B’s higher costs), and because of this favorable 
strategic effect firms will charge a lower price (and so produce more) 
in the sequential equilibrium. Also, Fudenberg and Tirole (1985) 


19 We analyze this case precisely in Bulow et al. (1983). 

20 An exception is when the learning curve has “spillovers” so that diffusion of 
knowledge allows all firms to learn from any one firm's production. (Lieberman [1982] 
gives empirical evidence for the importance of spillovers.) We discuss this case further 
in Bulow et al. (1983). 
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show that with price competition in a spatial location model, learning 
by doing still gives firms a strategic incentive to produce more. 


I. Natural Resource Markets 

Natural resource models are the mirror image of learning curve mod¬ 
els: They are games played over time (i.e., sequentially) where greater 
output in period 1 will increase marginal cost in period 2. That is, 
firms’ production costs will generally rise with cumulative output as 
“digging deeper” is required for extraction. In this case, with strategic 
substitutes, a natural resource oligopolist will produce less in the first 
period and more in the second than if both periods’ strategic variables 
were chosen simultaneously, because of the negative strategic effect. 
With strategic complements the firms will produce beyond the point 
where MR = MC in the first period because of the positive strategic 
effect. 

Bulow and Geanakoplos (1988) show that with strategic substitutes 
a firm may choose to produce some high-priced “backstop” reserves 
in the first period, even if some cheaper reserves that will never be 
used are available for the purpose. The reason is that the strategic 
benefit of leaving cheap reserves in the ground and lowering second- 
period marginal costs exceeds the cost of producing inefficiently. 


J. Product Portfolio Selection 

The most obvious application of our results is to the theory of how a 
firm should select a “portfolio” of businesses in which to compete. 
The strategic effect on an old market of producing in a new market 
must be considered. 

As a possible example of a firm that ignored the strategic effect of 
diversification, consider the case of Frontier Airlines. In the early 
1980s the firm expanded beyond its original Denver hub to capitalize 
on some apparently profitable opportunities. Many feel the airline 
made a tactical error. After Frontier spread itself over several new 
markets, other airlines began to compete much more aggressively for 
shares of the Denver market. Some of this new competition may have 
been inevitable in a changing, deregulated environment, but some of 
it was probably due to a perceived weakness on Frontier’s part that 
arose from its being “spread thin.” 21 


21 See “Where Frontier Lost Its Way,” Business Week, February 7, 1983, p. 120. 
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K. Markets Where Demands, Rather than Costs, Are 
Interrelated 

Throughout our analysis we assumed that demands across markets 
were independent so that the term $\partial^2 \pi^A / \partial S_1^A \partial S_2^A$ depended solely on 
whether A had joint economies or diseconomies in production. We 
could just as easily have assumed that demands were interrelated. 22 
With independent costs, $\partial^2 \pi^A / \partial S_1^A \partial S_2^A$ is positive if a firm’s demand in 
one market is complementary to its demand in the second (equivalent 
on the cost side to joint economies) and would be negative if selling 
more in one market hurts prospects in the other. Firms must consider 
the cross-effects in making marginal revenue calculations and consider 
the strategic effects of their actions in one market on competitors’ 
actions in a second. 23 

VII. Conclusion 

This research began as an investigation into how a change in one 
market can have ramifications in a second market, even if the de¬ 
mands in the two markets are unrelated. We found that a critical issue 
in determining the nature of the interaction was whether competitors 
regarded products as strategic substitutes or strategic complements. 
In other words, would a more aggressive strategy by one firm in a 
market elicit an aggressive response from its competitors, or would an 
aggressive move be met with accommodation (competitors playing 
less aggressively than previously)? 

This same distinction turns out to be critical in many other 
oligopoly models. Whether firms overinvest or underinvest in capital 
relative to the efficient level for production, whether innovations are 
most profitably sold for fixed fees or licensed for royalties, whether 
governments maximize national income by taxing or by subsidizing 
exports, whether firms have incentive to diversify into apparently 
unprofitable markets or to shun apparently profitable opportunities, 
and whether firms produce more or less when markets operate simul¬ 
taneously than when they operate sequentially, all depend on whether 


22 See, e.g., Judd (1983), who analyzes how selling goods that are substitutes affects a 
multiproduct firm’s ability to deter entry, and Klemperer (1984), who examines markets 
in which consumers’ costs of switching between brands of a product make it easier 
for a firm to sell to consumers who purchased from it in a previous period (market). 
Other examples of markets with interrelated demands are the market for children’s 
television shows and the toy market, and the markets for small and mid-sized cars. 

23 If demands are connected, firms must of course also account for any direct effects 
of their actions in one market on competitors’ behavior in another market. 
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competition is with strategic substitutes or with strategic com¬ 
plements. 

We cannot determine whether products are strategic substitutes or 
strategic complements without empirically analyzing a market. Both 
modes of competition are compatible with both price and quantity 
competition. In the case of quantity competition, the mode of compe¬ 
tition depends only on the market demand. With price competition, 
strategic substitutes competition is most likely with increasing mar¬ 
ginal costs and least likely with decreasing marginal costs, but the 
shape of the demand curve is again critical. Thus assumptions that 
are innocuous in most models of monopoly and atomistic competi¬ 
tion, for example, whether demand is locally linear or of constant 
elasticity, are of crucial importance in the oligopoly context. If an 
oligopoly model assumes, say, linear demand and quantity competi¬ 
tion, the real economic assumption may be that products are strategic 
substitutes. A local change in the curvature of demand might give 
strategic complements and reverse the results. 

It has long been suspected that any result in oligopoly theory, or its 
converse, can be generated by an appropriate choice of assumptions. 
Strategic substitutes and complements help explain this basic ambi¬ 
guity and so focus on a critical distinction. When thinking about 
oligopoly markets the crucial question may not be, Do these markets 
exhibit price competition or quantity competition or competition using 
some other strategic variable? but rather, Do competitors think of 
the products as strategic substitutes or as strategic complements? 
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What happens to the wealth of shareholders of firms producing 
defective products? Our answer—for producers of drugs and autos 
that were recalled from the market—is that the shareholders bear 
large losses. They are substantially greater than the costs directly 
emanating from the recall—for example, costs of destroying or re¬ 
pairing defective products. In fact, they are plausibly larger than all 
the costs attributable specifically to the recalled product; the losses 
spill over to the firm's “goodwill.” They also spill over to competitors. 
This negative externality may even be larger in the aggregate than 
the losses to the producer of the recalled product. 


I. Introduction 

This paper has a simple goal: to estimate the losses borne by owners 
of a firm that recalls a defective product from the market. While we 
stick close to the “facts," we hope they will shed some light on an 
important issue in consumer protection regulation. This is the extent 
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to which information about product quality is sufficient to deter pro¬ 
duction of faulty products. We focus on two products—drugs and 
autos—where extensive regulation of product quality occurs before 
marketing of the product. One rationale for such premarket regula¬ 
tion would be that mere disclosure of any defects after a good is 
marketed does not impose sufficient costs on the seller to deter opti¬ 
mally the production of defective products. Suboptimal deterrence 
could occur if, for example, consumers were insufficiently sophis¬ 
ticated in assimilating information about defects or the tort liability 
system insufficiently compensated them for resulting damages. 

While we do not address these normative issues directly, we hope 
that our results will be useful in assessing the magnitude of the poten¬ 
tial problem. Accordingly, we compare our estimates of losses to own¬ 
ers with independent estimates of some elements of direct costs to 
firms of recalling defective products. These would include the costs of 
destroying contaminated batches of drugs, the costs of repairing de¬ 
fective cars, and so on. An obvious question—and a test of capital 
market efficiency—would be whether the capital market internalizes 
these costs. If it fails to do so, any presumption of suboptimal deter¬ 
rence would be strengthened: some cases involve potentially large 
indirect costs for consumers—for example, health damages from a 
dangerous drug—and for these cases, optimal deterrence would re¬ 
quire a penalty greater than the direct costs we are able to estimate. 

We chose to focus on recalls of automobiles and drugs (prescrip¬ 
tion, over-the-counter, and medical devices) because each yields a 
good-sized sample of recalls and because we could obtain associated 
data on some elements of the direct costs of most of these recalls. The 
products also differ in an interesting dimension: drug recalls occur 
much less frequently (per firm) than auto recalls. Important examples 
of the latter occur every few weeks or months, while the former occur 
once or twice in a decade. 

Our primary finding is that the capital market in fact penalizes 
producers of both recalled drugs and autos far more than the direct 
costs. Indeed, the capital market penalty seems so great that it may 
even exceed a plausible independent estimate of the relevant social 
costs. We do not press this point, because we have only the most 
fragmentary data on the relevant indirect costs. But to the empirical 
question, How much deterrence does the capital market provide 
against the sale of faulty products? the answer implied by our data 
must be, “Considerable." 

We also find that competitors of drug and auto firms with recalled 
products are not helped by their rival's travail. In fact, in both cases 
they bear substantial losses. 

The next section sketches the theoretical link between a regulatory 
event, like a recall, and the capital market’s response to it. 1 The two 
following sections describe this response to recalls of drug and auto 
products, respectively. 


II. How “Should” the Stock Market React to 
News of a Product Recall? 

The extent to which the stock market reacts to some event that entails 
a cost to shareholders depends on how well the event is anticipated. 
Thus, wage payments impose costs on stockholders, but stock prices 
do not decline on payday. Recalls do not occur with quite the same 
predictable regularity as wage payments, but neither are they a com¬ 
plete surprise. In such cases, the stock market response to the event 
will understate the total costs borne by stockholders. To see this, let 
any uncertainty be resolved within a “month,” and suppose that only 
one of two things can happen to a firm next month: either a product 
is recalled at some cost ($K$) to shareholders or there is no recall. So, the 
firm’s month-end stock price ($S_1$) will be either 

$$S_1^{NR} = V, \tag{1}$$ 

if no recall occurs, or 

$$S_1^{R} = V - K, \tag{2}$$ 

if a recall occurs, where $V$ = present value of the firm’s profits, including 
all expected recall costs except those occurring next month, 
and where we assume independence of successive monthly events. 
The firm’s stock price at the beginning of the month is the present 
value of future profits, or 

$$S_0 = p(V - K) + (1 - p)V = V - pK, \tag{3}$$ 

where $p$ = probability that a recall occurs next month. Thus, if a 
recall occurs next month, the stock price will change by (2) − (3) 
above, or 

$$S_1^{R} - S_0 = -(1 - p)K, \tag{4}$$ 

that is, by the unexpected component (1 − p) of the recall cost. Only if 
the recall is entirely unexpected (p = 0) will (4) = −K. In months where 
recalls do not occur, stockholders get a capital gain of (1) − (3), or 

$$S_1^{NR} - S_0 = pK. \tag{5}$$ 


1 See Schwert (1981) for a fuller treatment. 
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So, to get the full loss to stockholders, we would need to divide (4) by 
(1 — p) or subtract (5) from (4). 

In practice, (4) and (4) — (5) will be about equal if p is small. This is 
the case with drugs where no company in our sample has been in¬ 
volved in more than two distinct recalls in nearly a decade. For this 
sample, then, we use conventional “event study” methodology, more 
fully described below, in which we in effect estimate just (4). But for 
the more frequent auto recalls we also provide estimates of (4) ÷ (1 − 
p) and (4) − (5). 
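
The two corrections can be illustrated with a small calculation. This is a minimal sketch of equations (1) through (5); the values chosen for V, K, and p are hypothetical and the function names are introduced here only for illustration.

```python
def stock_prices(v, k, p):
    """Return (S0, S1 if recall, S1 if no recall) from equations (1)-(3)."""
    s0 = p * (v - k) + (1 - p) * v
    return s0, v - k, v


def full_loss_by_scaling(recall_change, p):
    """Equation (4) divided by (1 - p)."""
    return recall_change / (1 - p)


def full_loss_by_difference(recall_change, no_recall_change):
    """Equation (4) minus equation (5)."""
    return recall_change - no_recall_change


v, k, p = 100.0, 10.0, 0.2   # hypothetical values
s0, s1_recall, s1_no_recall = stock_prices(v, k, p)
recall_change = s1_recall - s0          # equation (4): -(1 - p) * k
no_recall_change = s1_no_recall - s0    # equation (5): p * k

print(f"observed change on recall:   {recall_change:.2f}")
print(f"full loss via (4) / (1 - p): {full_loss_by_scaling(recall_change, p):.2f}")
print(f"full loss via (4) - (5):     {full_loss_by_difference(recall_change, no_recall_change):.2f}")
```

With a 20 percent recall probability, the observed price drop understates the full cost by one-fifth, and either correction recovers the full loss of K.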


III. The Stock Market Response to Drug Recalls 

A. Drug Recall Sample 

When a drug product is found to be defective, the manufacturer is 
required to remove it from the market. This recall can be initiated by 
either the manufacturer or the Food and Drug Administration 
(FDA), and it can involve anything from a few bottles of contaminated 
or mislabeled product to the permanent removal of a product from 
the marketplace. Several hundred recalls occur in a typical year, and 
most involve minor health or financial consequences. Our sample 
comes from those recalls reported in the trade press. 

Specifically, we consulted the weekly reports of FDA Recalls and 
Court Actions in the Food, Drug and Cosmetic Reporter, an industry 
newsletter commonly called the “Pink Sheets.” Recalls were included 
in our sample if the Pink Sheets report gives an estimate either of the 
direct costs of the recall or, more commonly, of the number of units 
recalled. We also include those recalls where direct cost estimates are 
reported in the Wall Street Journal (WSJ). Our sample period runs 
from 1974 through 1982. We exclude cases without stock returns data 
for the manufacturer. 

Our sample overrepresents large and hazardous recalls (because 
these are always reported in either the Pink Sheets or WSJ). For 
example, the FDA uses a three-level classification of recalls in which 
Class I recalls involve the most serious potential health hazard. Class I 
recalls account for over half of our sample, while Block (1980) reports 
that they account for less than 2 percent of the over 3,000 FDA recall 
citations issued between 1973 and 1978. Many of our cases received 
considerable publicity. Over half were covered by the Wall Street Jour¬ 
nal. Five of these cases were serious enough so that the recalled prod¬ 
ucts were withdrawn indefinitely from the market. Table 1, which is 
elaborated below, summarizes this sample. It shows the names of the 
manufacturers of the recalled drugs in our sample, the event dates 
Case   Manufacturer              Event Date   Direct Cost      Stock Market   Direct Cost 
                                              (% of Value)     Response (%)   ($ Thousands) 
10.1   Milton Roy (N)            7/15/76          .05              3.64              12 
10.2   Milton Roy                8/25/76          .05             -1.98              12 
11     Morton Norwich            11/23/79         .06             -2.75             250 
       Johnson (Ortho)           10/13/75         .13              2.65           6,500 
       Parke Davis               8/13/76          .04              -.63           1,000 
       Procter-Gamble (Rely)†    9/18/80         2.46             -5.29         150,000 
       Richardson                10/1/78         1.68             -8.18          11,500 
and estimates of the stock market response, and direct costs incurred 
by these manufacturers as a result of the recalls. 

B. Choosing Event Dates 

For each recall, we sought to identify the earliest date at which news 
of a recall might have become public. For most of the cases reported 
by the WSJ, this is the date that the recall begins. Sometimes, however, 
the first hint of a recall appears in a story preceding the recall date— 
for example, a story implicating a drug in a serious health problem. 
In these cases, we use the date of the earliest WSJ story on the trou¬ 
bled product. For cases covered only in the Pink Sheets, our event 
date is the earliest date on which the FDA notified the manufacturer 
to recall the product. 2 

Sometimes news about essentially the same product defect is spread 
out over time. For example, two defective batches of a product are 
found several weeks apart (cases 2.1 and 2.2) or a product defect is 
found a month before the firm decides that a recall is necessary (26.1 
and 26.2). We treated these related episodes as separate events (and 
split direct costs evenly among them) if more than 3 weeks elapsed 
between the events. These are identified by case numbers with decimals 
in the table. (We treat related events less than 3 weeks apart, like 
the rest, as a single event beginning on the earliest date of adverse 
news.) 


C. Direct Costs of Recalls 

For most recalls, we estimate the “direct cost” by assuming that all of 
the defective units become worthless on recall. Specifically, where the 
Pink Sheets report the number of units of the recalled batch that are 
in distribution channels, we multiply this figure by the wholesale price 
of the product as reported in the appropriate yearly issue of the 
American Druggist's Blue Book and the Drug Topics Red Book to estimate 
“direct costs.” 

For some of the more publicised recalls (cases 6, 7, 14, and 26), 
direct cost estimates were available from news stories, because the 
companies took an extraordinary charge to their income. For in¬ 
stance, the WSJ reported on October 29, 1982, that it would cost 
Johnson and Johnson about $50 million to recall and destroy 22 mil¬ 
lion units of Extra-Strength Tylenol capsules. It also reported that 


2 Sometimes the Pink Sheet story implies that the initial FDA communication is 
private. For these cases, we use the publication date of the Pink Sheet story—usually a 
week or so later. 
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new tamper-proof packaging, additional television advertising, and 
related efforts to rebuild consumer confidence would cost another 
$50 million. Therefore, our estimated direct cost to Johnson and 
Johnson of the Tylenol recall is $100 million. 

We make no allowance for tax benefits due to recall costs. Where we 
use reported extraordinary charges, we use the pretax figure, and we 
ignore any tax savings from inventory losses. Accordingly, our direct 
cost estimates may be overgenerous. 

For each recall, table 1 gives the estimated direct cost in dollars (last 
column) and as a percentage of the market value (just before recall) 
of the respective manufacturer’s common stock. In both dollar and 
percentage terms, the recall of Procter and Gamble’s Rely tampon 
(14) entails the largest estimated direct cost ($150 million, 2.5 percent 
of market value) in our sample, while the Class I recall of Abbott 
Laboratories’ Plasmatein (1) is the least costly ($5,000, 0.0005 percent). 

D. Capital Market Returns 

The full cost to manufacturers of recalled drugs is measured by 
net-of-market (or excess) stock returns in the period surrounding public 
announcement of the recall. These excess returns are obtained from 
the Scholes excess return file at the University of Chicago’s Center for 
Research in Security Prices (CRSP). 3 We cumulate excess returns for 
each manufacturer over several “event windows” of different intervals 
to allow for pre-event leakage or postevent revision. The narrowest 
event window is 6 days, from t = -2 to t = 3, where t = 0 is the 
formal event date of the recall. The widest event window is from 
t = -49 to t = 50. 4 
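
The cumulation over event windows can be illustrated with a minimal Python sketch. It assumes that "cumulate" means a simple sum of daily excess returns (the text does not say whether returns are summed or compounded). The function name and the daily figures are hypothetical and are not taken from the CRSP file.

```python
def cumulative_excess_return(excess_returns, start, end):
    """Sum daily excess returns for event days start..end inclusive.

    Assumption: CER is a simple sum of daily excess returns (in percent).
    """
    return sum(excess_returns[t] for t in range(start, end + 1))


# Hypothetical daily excess returns (percent), keyed by event day t,
# where t = 0 is the formal event date.
daily = {t: 0.0 for t in range(-49, 51)}
daily[-1], daily[0], daily[1] = -0.5, -2.0, -1.0

# The narrowest, the 2-week, and the widest windows named in the text.
windows = [(-2, 3), (-4, 5), (-49, 50)]
cers = {w: cumulative_excess_return(daily, *w) for w in windows}

for (start, end), value in cers.items():
    n_days = end - start + 1
    print(f"CER({start}, {end}) over {n_days} days: {value:.2f} percent")
```

Because every window is centered near t = 0, a loss concentrated around the event date shows up identically in the narrow and wide windows, which is the pattern the wider windows are meant to test for.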

Table 2 presents mean cumulative excess returns (CER) for various 
event windows. These are negative for every window from 1 week to 5 
months around the event date. But the 2-week window, CER(-4, 5), 
yields a loss roughly within a percentage point of that for any wider 


3 For cases 3, 4, 8.1, and 8.2, stock returns are unavailable from this source. So we 
constructed excess return series for these cases by subtracting the return to the New 
York Stock Exchange Index from returns to these firms' stocks. 

4 Some of the wider event windows result in overlap of the related events denoted by 
decimal case numbers in table 1. In these cases, we (1) arbitrarily split the time between 
events in half and attributed the excess return for any day to the event closest in time 
and (2) set the remaining excess returns to zero. For example, cases 26.1 and 26.2 occur 
22 trading days apart. Excess returns for the first 11 days after May 29, 1974, are 
attributed to 26.1, and all subsequent excess returns are set equal to zero for that case. 
Excess returns for the 11 days ending June 28, 1974, are attributed to 26.2, and all 
preceding excess returns are set equal to zero for that case. In this way, we avoid double 
counting of the same excess return. 
TABLE 2 

Means and Dispersion Measures for CER in Drug Recall Firms 
(Various Intervals), Direct Cost and CER to Drug Portfolio 

                                                 Percentage of 
                                                 All CERs 
Variable Name*           Mean (%)    t(Mean)†    Negative         t(%)‡ 

CER(-49, 50)             -6.742      -2.17       62.5             1.46 
CER(-29, 30)             -5.479      -2.27       71.9             2.76 
CER(-14, 15)             -7.147      -3.63       71.9             2.76 
CER(-9, 10)              -6.563      -4.71       84.4             5.36 
CER(-4, 5)               -6.132      -6.23       90.6             7.87 
CER(-2, 3)               -2.832      -4.07       84.4             5.36 
CER(-4, 0)               -2.36       -3.72       71.9             2.76 
CER(1, 5)                -3.78       -5.05       81.3             4.54 
BDRUG(-4, 5)             -1.170      -3.49       81.3             4.54 
Relative direct cost       .534       3.05 

* CER(-X, Y) is the cumulative excess return (from Scholes's Returns Tape, University of Chicago, CRSP) 
from X trading days before to Y trading days after the recall event. BDRUG(-4, 5) is the cumulative excess return to 
an equal-weighted portfolio of all NYSE or ASE drug manufacturers having an SIC of 2831, 2840, or 2841 (about 56 
firms). The cumulative excess return to this drug portfolio is computed from t = -4 to t = 5 for each date on which 
there was a drug recall that is included in our sample. The drug firm subject to the recall is excluded from the drug 
portfolio when computing BDRUG for each particular recall event. Relative direct cost is the estimated direct loss 
expressed as a percentage of the market value of the equity of the recall firm 40 trading days before the recall event 
date. 

† Ratio of mean to its standard error. The standard error of the mean CER(-X, Y) is computed as 

$$\mathrm{S.E.} = \frac{\left(\sum_{i=1}^{N} \sigma_i^2\right)^{1/2}}{N},$$ 

where $\sigma_i^2$ is the variance of the ith recall firm's excess stock return and N = 32 recalls. $\sigma_i^2$ is estimated for each firm 
by using daily excess returns from t = -49 to t = -5 and t = 6 to t = 50. Let $s_i^2$ be the variance of the above-defined 
time series of daily excess returns. Then $\sigma_i^2$ is computed by multiplying $s_i^2$ by T, where T is the number of trading 
days in the particular event window. T is 10 for CER(-4, 5), 20 for CER(-9, 10), and so on. These standard errors 
are virtually identical to the standard errors of the sample mean CERs. 

‡ Fraction of all CERs negative minus 0.5, divided by the standard error from the binomial distribution: S.E. = (PQ/N)^(1/2), 
where P = proportion of CERs negative, Q = (1 - P). 

window. This means that essentially all of the market response to the 
event is compressed into the 2 surrounding weeks. In addition, there 
are no systematic “mistakes”; that is, there is no systematic recovery 
of some of these losses in the 50 days after the event date, or else the 
CER(-49, 50) would be smaller than CER(-4, 5). Finally, fully 
nine-tenths of the sample suffers a loss in the 2 weeks surrounding a recall. 
(This proportion is indistinguishable from 0.5 in all the other 2-week 
subperiods.) So there can be little doubt that recalls constitute adverse 
news for stockholders and that most of the uncertainty about them is 
resolved in the 2 weeks surrounding public disclosure of the recall. 
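
The significance of the "nine-tenths" figure can be checked with a short sketch of the binomial test described in the notes to table 2: the fraction of negative CERs minus 0.5, divided by (PQ/N)^(1/2). The function name is hypothetical, and the count of 29 negative CERs is an assumption inferred from 90.6 percent of the 32 recalls in the sample.

```python
import math


def fraction_negative_t(n_negative, n_total):
    """t-ratio for the share of negative CERs against a null share of 0.5.

    Uses the binomial standard error sqrt(P * Q / N) with P the observed share.
    """
    p = n_negative / n_total
    standard_error = math.sqrt(p * (1 - p) / n_total)
    return (p - 0.5) / standard_error


# Assumption: 90.6 percent of 32 recalls corresponds to 29 negative CERs.
t_two_week = fraction_negative_t(29, 32)
print(f"share negative = {29 / 32:.3f}, t = {t_two_week:.2f}")
```

The result is close to 7.9, in line with the value shown for CER(-4, 5) in table 2; small differences can arise from rounding the proportion before computing the ratio.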

Table 2 splits this 2-week CER into its prerecall, CER(-4, 0), and 
postrecall, CER(1, 5), components. Both are negative. So, if our event 
date is the earliest date of public information, these data imply some 
prior leakage of public information. 
Perhaps the most striking result in the table is the magnitude of the 
capital losses due to recalls. In particular, they are much larger than 
our generous estimate of direct costs. The mean CER(-4, 5) is -6.13 
percent, which is fully 12 times the mean relative direct cost of 0.53 
percent (and over 50 times the median). We never fully succeed in 
explaining this enormous gap. 

Table 2 also shows the CER(-4, 5) for an equally weighted portfolio 
of drug firms not involved in the recall. Our motive here is to see 
if competitors benefit from the adversity visited on the seller of the 
recalled product. Instead, the spillover seems negative. All other drug 
stocks suffer a (significant) mean loss of just over 1 percent in the 2 
weeks surrounding a recall. This cannot be explained by any tendency 
for recalls to be bunched (in which case one recall would beget 
expectations of others). 5 
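
The bunching check cited in the footnote to this paragraph rests on a property of the Poisson distribution: its standard deviation equals the square root of its mean. A minimal sketch of that arithmetic, using the 26 events in 9 years reported in the footnote (variable names are illustrative):

```python
import math

events = 26          # unrelated recall events, from the footnote
years = 9            # 1974-82

rate = events / years                 # mean recalls per year, about 2.9
poisson_sd = math.sqrt(rate)          # Poisson: variance equals the mean

print(f"mean per year = {rate:.2f}, implied Poisson s.d. = {poisson_sd:.2f}")
```

The implied value of about 1.7 is the benchmark the footnote compares with the sample standard deviation of 1.45. The sample figure itself cannot be recomputed here because the yearly counts are not listed in this section.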

E. Direct Costs and Capital Market Losses 

The large difference between the capital market losses and our estimate 
of direct costs led us to see if the capital losses are related to the 
degree of publicity surrounding the recall or to whether there was a 
complete product withdrawal. These may be proxies for costs that we 
cannot estimate. For example, a withdrawal may engender losses to 
specific assets (e.g., research and development, past advertising) that 
are not written off. For the 14 recalls in our sample that neither were 
covered by the WSJ nor involved a withdrawal, the mean CER(-4, 5) 
is -3.76 percent, while it is -6.36 percent for the 13 nonwithdrawal 
cases covered by the WSJ. The mean CER for the five withdrawals (all 
were covered by the WSJ) is -12.18 percent. So the crude data imply 
that both extra publicity and withdrawal are costly. But the remaining 
cases still entail an enormous discrepancy between the capital loss and 
direct costs. 

The relationship between capital market losses and direct costs is 
shown more formally in table 3. Part A contains regressions of CERs 
on direct costs and dummies for publicity (= 1 if there was a WSJ 
story) and withdrawal and the CER(-4, 5) to the portfolio of other 
drug firms. This last variable is not really exogenous, given the previously 
documented spillover effect of recalls. But we include it to account 
crudely for the industry-specific component of the total loss (as 

5 We have 26 unrelated events in the 9 years 1974-82, or about three per year. If 
recalls were being generated by a Poisson process with a mean of (26/9) per year, the 
standard deviation would be 1.7. This differs insignificantly from the sample standard 
deviation of 1.45, so the distribution of recalls seems essentially random. These data, of 
course, could hide some more complicated interdependencies; e.g., one recall could 
signal an increase in all firms' quality control expenditures. 
well as “other” industry-specific news). These regressions confirm the 
tendency for both publicity and product withdrawal to be costly, 
though some of the standard errors are large enough to caution 
against pushing these conclusions too hard. The main new result is 
the negative coefficient of the relative direct cost variable: it says that 
an extra dollar of direct cost adds $2.00-$4.00 to the stockholders’ 
loss. This implies that even our generous estimates of direct losses are 
systematically low but correlated positively with the “true” cost of a 
recall. 

Part B of table 3 tells us that direct costs are higher for publicized 
recalls and for withdrawals. The relevant coefficients are statistically 
weak, but they are large relative to the mean direct cost. This implies 
that part of the extra costs of publicity and product withdrawals 
shown in table 3 are due to the tendency for these recalls to have 
larger direct costs. 

The larger message suggested by both the crude data and table 3 is 
that stockholder losses from recalls go beyond costs that can be attributed 
to the specific product. Part C of table 3 shows that the CER to 
competitors is much more weakly related to the case-specific variables 
than is the recall firm’s CER. This means that any recall, regardless of 
“size,” engenders a roughly similar industry-wide asset loss. Further, 
even after allowing for a multiple of direct costs (as in part A of table 
3), we do not come close to rationalizing the 6 percent average loss of 
a recall. That is, the regressions imply that an unpublicized recall that 
does not result in a withdrawal and has trivial direct costs still entails a 
loss of over 3 percent, based on CER(-4, 5). 

We have so far not dealt explicitly with one potentially important 
product-specific cost: expenses for product liability suits. But these 
cannot amount to much for a case involving a small defective batch of 
an otherwise safe product. We suspect that the major impact of product 
liability costs is showing up in the large coefficient of direct costs 
and in the extra losses due to product withdrawals. Every withdrawal 
in our sample has engendered well-publicized product liability suits. 

For one of these we have a long profile of product liability costs. 
Though samples of one yield notoriously noisy estimates, it seems 
worth exploiting these data to get a sense of the likely magnitude of 
this specific cost. The case involves the Dalkon Shield, an intrauterine 
birth control device that was implicated in the deaths of some users. 
The two events in our sample (26.1 and 26.2) emanating from this 
product withdrawal generated CER(-4, 5) values of -18 and -11 
percent, or a total loss of around $150 million to the manufacturer, 
A. H. Robins. Robins took a pretax charge in 1974 of $5.1 million for 
the costs directly related to withdrawing the product and destroying 
inventory, and these are shown in table 1. The company also agreed 
with the SEC to break out all expenses (extra legal fees and uninsured 
liability payments) related to litigation over this product in its financial 
statements. It has done this in every annual report from 1976 to date. 
The total of the pretax charges reported for 1976-82 is $29 million, 
or a 1974 present value of $17 million using a 10 percent annual 
discount rate. A simple regression of the log of the annual elements 
of this expense stream against time implies a mean increase in these 
expenses of 21 percent (S.E. = 11 percent) per year. We then assumed 
that expenses would continue to be incurred for another 5 
years and would equal the predicted values from this regression in 
each year from 1983 through 1987. These assumptions imply an 
additional $21 million of liability costs in 1974 present value, bringing 
the total to $38 million. 
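
The order of magnitude of these present values can be reproduced with a stylized sketch. The annual charges are not listed in the text, so the code assumes a hypothetical smooth path that grows at the fitted 21 percent per year and sums to the reported $29 million for 1976-82; the function names are illustrative and the actual year-by-year figures will differ.

```python
GROWTH = 0.21      # fitted annual growth of expenses (from the text)
DISCOUNT = 0.10    # annual discount rate (from the text)
BASE_YEAR = 1974   # present values are expressed in 1974 terms


def geometric_stream(first_year, last_year, first_value, growth):
    """Hypothetical expense path growing at a constant rate ($ million)."""
    return {year: first_value * (1 + growth) ** (year - first_year)
            for year in range(first_year, last_year + 1)}


def present_value(stream, base_year, rate):
    """Discount each year's amount back to base_year."""
    return sum(value / (1 + rate) ** (year - base_year)
               for year, value in stream.items())


# Assumption: 1976-82 charges lie exactly on the 21 percent growth path
# and add up to the reported total of $29 million.
n_years = 1982 - 1976 + 1
first_value = 29.0 * GROWTH / ((1 + GROWTH) ** n_years - 1)
reported = geometric_stream(1976, 1982, first_value, GROWTH)

# Extrapolate the same path for 1983-87, as the text describes.
projected = geometric_stream(1983, 1987,
                             first_value * (1 + GROWTH) ** n_years, GROWTH)

pv_reported = present_value(reported, BASE_YEAR, DISCOUNT)
pv_projected = present_value(projected, BASE_YEAR, DISCOUNT)
print(f"PV of 1976-82 charges: {pv_reported:.1f}")
print(f"PV of 1983-87 projection: {pv_projected:.1f}")
```

Under these assumptions the two present values come out near the $17 million and $21 million quoted in the text, which suggests the stylized path is a reasonable stand-in for the unreported annual data.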

This exercise tells us that, in (partial) hindsight, a reasonably complete 
independent estimate of the full product-specific costs of the 
recall to Robins is on the order of under one-third of the stock market 
loss. (Since Robins has had an average tax rate of over 40 percent in 
recent years, even this is too high.) So, if the product liability component 
of this cost is anything like the consumer cost of the product 
defect, the stock market loss would exceed the “social loss.” While we 
hesitate to push these fragmentary data this far, 6 they, like the preceding 
data, show how substantially the stock market losses can exceed 
those costs specifically attributable to the recall of a specific drug. 

Another way of putting this is that the stock market is imposing a 
substantial goodwill loss on a firm over and above the product-specific 
costs. The stock market appears to expect that news of a recall will 
reduce consumers’ demand (or raise costs) for other products sold by 
the firm and thereby impose additional losses on the firm. We tested 
this conjecture by adding the market value of the firm to the regressions 
in table 3. A single product typically accounts for a smaller 
fraction of a firm’s profits the larger the firm. Thus, the percentage 
loss due to recall of a single product should be smaller for larger 
firms, if there is no spillover to other products. But the coefficient of 
the firm’s market value was never as much as one-tenth of its standard 
error, and this implies that losses do spill over to the firm’s other 
products (just as they seem to spill over to other firms in the same 
industry). 

This goodwill element of the recall loss poses a challenge to further 


6 A fuller treatment would require us to see if announcement of the liability costs 
affected the returns to Robins's stock. For example, if the initial reaction overestimated 
these costs, subsequent announcements of the actual costs would engender positive 
excess returns. 
research, because there seems to be no easily apprehensible basis for 
expecting one product failure to beget others. As nearly as we can tell 
from our recall data, product failures occur randomly. However, 
whatever their source, it seems clear that the costs to drug firms of a 
recall are so large that they must exert a powerful deterrent effect on 
the production of defective products. 

IV. Auto Recalls 

Since the late 1960s, the National Highway Traffic Safety Administration 
of the Department of Transportation (DOT) has been empowered 
to order manufacturers to recall and repair autos with defects 
that compromise safety. We use a sample of 116 “major recalls” 
that occurred in 1967-81 to analyze the stock market's response to 
the news of this form of product defect. Our analysis recognizes a 
problem we raised in Section II: auto recalls occur too frequently to 
be entirely surprising to the stock market, so the market’s response to 
the news of a recall can understate the full costs it imposes on producers 
of recalled cars. 

A. Recall Sample 

Each recall is initiated by an order from DOT specifying which particular 
group of cars is to be recalled and what is to be done to fix the 
cars. The distribution of the number of cars per recall is highly 
skewed. Some involve a few hundred cars or even fewer, and a few 
involve millions of cars. Our sample is designed to exclude many 
obviously trivial cases while retaining enough variety to permit analysis 
of the effects of recall size. It is drawn from all recall announcements 
reported in the WSJ involving the domestic Big 3 (GM, Ford, 
and Chrysler) for 1967-81 that exceeded 50,000 cars for GM, 20,000 
for Ford, and 10,000 for Chrysler. 7 These cutoffs are crudely consistent 
with the relative market shares (and stock market values) of these 
firms, and they result in roughly equal representation of each firm in 
our sample. The sample is described more precisely in table 4. Even 
after excising the small recalls, there is a very broad range of recalls in 
our sample. Our sample remains highly skewed to the right; every 
relevant coefficient of variation comfortably exceeds one. Chrysler 
has the smallest recalls, GM the biggest, but these ranks are reversed 
when recalls are measured relative to market value. 


7 We have no stock market data for foreign producers, and American Motors has too 
few recalls to permit reliable comparisons with the others. 
B. Stock Market Response to Auto Recalls 

For each recall in our sample we computed CERs for various periods 
around the event date, that is, the date of the WSJ story about the recall. We 
used the same source and procedure as for drug recalls. The basic 
results are in panel 1 of table 5: Average CERs are significantly negative 
for every event window, and the average gets larger absolutely as 
the windows widen. We did not go beyond the 2-week window, 
CER(-5, 5), because recalls are so numerous that much wider windows 
would have created serious overlap problems. 8 That window 
yields a mean CER of -1.60 percent. About half this total is realized 
in the 3 days surrounding the event (-1, 1), another one-third in the 
subsequent 4 days, CER(2, 5), with the one-fifth or so remaining 
leaking out prior to the day before the event. Also, there is a 
significantly above-average frequency of negative CERs for every 
window, though these do not begin to approach the near unanimity in 
the corresponding data for drugs. 

Panels A, B, and C of table 5 break out results by company. Every 
firm suffers a negative average CER and an above-average frequency 


TABLE 5 

Mean CER for Auto [...] Intervals 

                          CER(-5, 5)   [...]      CER(-1, 1)   [...]      CER(2, 5) 

Mean                      -1.60        -.96       -.81         -1.07      -.53 
t                          3.40        [...]      3.30         2.85       1.87 
Percentage negative       61.2*        62.1*      64.7*        62.1*      60.3* 

A. General Motors (41): 
Mean                      -.97         -.80       [...]        -.49       [...] 
t                          1.64        1.70       [...]        1.02       1.38 
Percentage negative       56.1         [...]      56.1         [...]      [...] 

B. Ford (44): 
Mean                      -2.03        -1.58      -.63         -1.51      -.52 
t                          3.50        3.42       2.08         3.26       1.49 
Percentage negative       63.6         68.2*      61.4         63.6       54.5 

C. Chrysler (31): 
Mean                      -1.83        -.28       -.98         -1.24      -.59 
t                          1.37        .26        1.40         1.16       .73 
Percentage negative       64.5         58.1       67.7*        [...]      74.2* 

Note. See text for description of sample, and see notes to table 2 for method of computing t. 
* = t > 2.0 (see note to table 2). 


8 As it is, four of our 116 cases overlap. We left the overlaps in our sample, but no 
result would change very much if the overlapping cases were deleted or if we had made 
the same adjustments as for drug-recall overlaps. 
of negative CERs for every event window. That unanimity tends to 
support a conclusion that recalls are costly, even though many of the 
individual statistics in panels A-C are not significant. The rather wide 
standard errors on some of these make us cautious about pushing 
comparisons among firms too hard, but it appears that GM loses 
about half as much per recall as either of its competitors, based on 
CER(-5, 5). This difference is due mainly to GM’s smaller recalls per 
dollar of market value (see tables 4 and 7). 

1. Does the CER Understate the Cost of a Recall? 

Our discussion in Section II implies that the CER for recall periods is 
an estimate of -(1 - p) × K, where K = the cost of a recall to a 
company and p = probability of a recall. So one way to estimate K 
would be to estimate p directly and divide the CER by (1 - p). To see 
where such a procedure would lead, note that every company in our 
sample experienced an average of 2-3 major recalls per year in the 
1967-81 period, or about 1 in every 10 2-week periods. If uncertainty 
is resolved within 2 weeks, p ≈ .1, and this implies an estimated 
average loss due to a recall (K) of around 1.8 percent of market value 
rather than the 1.6 percent in table 5. 
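
The arithmetic behind the 1.8 percent figure is a one-line inversion of CER = -(1 - p) × K. The sketch below is illustrative only: the function name is hypothetical, and the use of 2.5 recalls per year and 26 two-week periods per year is an assumption made to show where p ≈ .1 comes from.

```python
def implied_recall_cost(mean_cer, p):
    """Solve CER = -(1 - p) * K for K (same units as mean_cer)."""
    return -mean_cer / (1 - p)


# Assumptions for illustration: midpoint of "2-3 recalls per year" and
# 26 two-week periods in a year.
recalls_per_year = 2.5
periods_per_year = 26
p_estimate = recalls_per_year / periods_per_year   # roughly 0.1

# Using the rounded p = .1 and the mean CER(-5, 5) of -1.60 percent.
K = implied_recall_cost(-1.60, 0.1)
print(f"p is about {p_estimate:.3f}; implied K is about {K:.2f} percent")
```

The correction is small because recalls are only moderately likely in any given 2-week period; it would grow quickly if p were closer to one.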

However, this procedure would be biased if “other” news during 
recall periods was systematically favorable or unfavorable on average. 
That possibility needs to be taken seriously for the auto industry in 
1967-81, a period in which adverse effects of foreign competition, 
pollution regulation, and so on cannot have failed to affect the industry’s 
stock market performance. If nonrecall surprises in 1967-81 
were indeed systematically unfavorable for auto stocks, then (1) the 
mean CER for nonrecall periods would be less than pK, the capital 
gain due to absence of a recall, and (2) the mean loss in recall periods 
would exceed -(1 - p)K. But, if the average effect of adverse “other” 
news is the same in recall and nonrecall periods, we can still estimate 
K by subtracting the mean CER in recall periods from the mean CER 
in nonrecall periods. 9 This procedure is implemented in table 6 in the 
column labeled “adjusted mean CER”: For each year we compute the 
mean CER(-5, 5) for every nonrecall period for each of the three 
firms. 10 Then we subtract this year- and company-specific nonrecall 
mean CER from the CER(-5, 5) for each recall experienced by the 
company in the same year. For ease of comparison, we repeat the 


9 Call this common average effect of adverse news (-X). Then, the nonrecall period 
mean CER is (pK - X), the recall period mean CER is [-(1 - p)K - X], and the 
difference between them is just K. 

10 More precisely, we compute 11 x mean daily ER for nonrecall periods. 
TABLE 6 

Mean CERs for Auto Recalls: Adjusted for Nonrecall News: 
1967-81 and Subperiods 

                               Adjusted Mean           Unadjusted Mean 
Sample and Number of            CER(-5, 5)               CER(-5, 5) 
Recalls (in Parentheses)     Mean (%)     t           Mean (%)     t 

1. All 1967-81 (116):        -1.38        2.83        -1.60        3.40 
   A. General Motors (41)     -.60         .91         -.97        1.64 
   B. Ford (44)              -2.05        3.51        -2.03        3.50 
   C. Chrysler (31)          -1.46        1.06        -1.83        1.37 
2. All 1967-74 (53):          -.60         .77         -.55         .73 
   A. GM (18)                 -.56         .57         -.57         .63 
   B. Ford (17)               -.44         .68         -.44         .60 
   C. Chrysler (18)           -.81         .40         -.64         .33 
3. All 1975-81 (63):         -2.04        3.35        -2.48        4.19 
   A. GM (23)                 -.64         .71        -1.28        1.45 
   B. Ford (27)              -3.07        3.88        -3.02        3.79 
   C. Chrysler (13)          -2.37        1.29        -3.48        1.99 

Note. See text for description of adjusted mean CER(-5, 5). Unadjusted mean CER(-5, 5) is computed as in 
table 5. 


unadjusted CER(-5, 5) from table 5, and we provide the added detail 
of a subperiod breakdown. 
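
The adjustment described above (subtract a firm- and year-specific nonrecall benchmark, computed as 11 times the mean daily excess return in nonrecall periods, from each recall's CER(-5, 5)) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and all input numbers are hypothetical and are not the paper's data.

```python
from collections import defaultdict
from statistics import mean

WINDOW_DAYS = 11  # CER(-5, 5) spans 11 trading days


def nonrecall_baseline(daily_excess_returns):
    """Map (firm, year) to 11 x the mean daily excess return in nonrecall periods."""
    grouped = defaultdict(list)
    for firm, year, excess_return in daily_excess_returns:
        grouped[(firm, year)].append(excess_return)
    return {cell: WINDOW_DAYS * mean(values) for cell, values in grouped.items()}


def adjust(recall_cers, baseline):
    """Subtract the matching firm-year nonrecall mean CER from each recall CER."""
    return [(firm, year, cer - baseline[(firm, year)])
            for firm, year, cer in recall_cers]


# Hypothetical inputs, in percent.
nonrecall_days = [("Ford", 1978, -0.02), ("Ford", 1978, 0.00), ("Ford", 1978, -0.04)]
recall_events = [("Ford", 1978, -3.00)]

adjusted = adjust(recall_events, nonrecall_baseline(nonrecall_days))
print(adjusted)
```

In this example the nonrecall benchmark is negative (-0.22 percent per 11-day window), so the adjusted recall loss is smaller in absolute value than the raw CER, which is the direction of the adjustment reported for the full sample.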

None of the results in table 5 are much affected by our adjustment. 
The adjusted mean CER(-5, 5) for 1967-81 remains significantly 
negative. It is a bit (0.2 percent) smaller than the unadjusted mean, 
but some of this may be due to a conservative bias stemming from 
subsequently documented negative spillovers of recalls. 11 The main 
innovation in table 6 is in the subperiod data of panels 2 and 3, not in 
how the CERs are calculated. These reveal a sharp difference in the 
impact of recalls between periods. The average recall costs less than 1 
percent of market value before 1975 (for every firm) regardless of 
how the CER is measured, and it costs more than 2 percent after 
1975. This difference is mainly attributable to Ford and Chrysler, 
whose average recall period CERs in this post-1975 period range 
from around -2.5 to -3.5 percent. 

The next section, however, shows that the substantial difference 
between the stock market response to pre- and post-1975 recalls is 
more apparent than real. 


11 So, e.g., Ford's non-recall-period mean CER is capturing negative spillovers from 
GM recalls. This means that “other” news, including news of other companies’ recalls, is 
more adverse on average during nonrecall periods than in recall periods. 
2. The Relation between Stock Market 
Losses and Recall Size 

We cannot compare the direct cost of auto recalls with the stock 
market loss, as we did for drugs, because we have no estimates of the 
former. We know only how many cars were involved in each recall 
and, from fragmentary press reports, that the firms' estimates of their 
repair costs per car range widely from something like $10 to $1,000. 
Perhaps because of the “measurement error” entailed by this wide 
range, efforts to extract the per car stock market cost by regressing 
CERs on the number of cars (per dollar of market value) across recalls 
proved singularly unrewarding. 12 

It proved more rewarding to aggregate over subgroups of recalls 
and thereby iron out some of the variability across recalls in the cost 
per car. Table 7 summarizes (1981) dollar losses per recall (= 
CER[-5, 5] × market value/GNP deflator) and per car for various 
groups of recalls. While we report a mean loss per car, we have little 
confidence that the high dollar amounts are meaningful. These 
means are dominated by a few extremely small recalls that generate 
extremely large losses per car. Accordingly, we show two other measures 
less affected by these extreme values: the median and the 
mean dollar loss per mean number of cars in a recall (labeled mean 
per mean). 13 This last datum is equivalent to aggregate losses in a 
sample divided by aggregate cars, so it comes closest to summarizing 
the experience of these firms over long periods. What is perhaps most 
interesting about this figure is its stability over time and between 
companies: in any large sample of recalls, the loss per car seems to be 
around $200. 
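
The difference between the mean loss per car and the "mean per mean" measure can be seen in a two-recall example of the kind given in the footnote to this paragraph: a 1-car recall and a 100-car recall, each with a $100 loss. The function names below are illustrative.

```python
def mean_loss_per_car(recalls):
    """Average of each recall's own loss-per-car ratio."""
    return sum(loss / cars for loss, cars in recalls) / len(recalls)


def mean_per_mean(recalls):
    """Mean loss divided by mean cars, i.e. total losses over total cars."""
    total_loss = sum(loss for loss, _ in recalls)
    total_cars = sum(cars for _, cars in recalls)
    return total_loss / total_cars


# (dollar loss, cars recalled): a 1-car recall and a 100-car recall.
recalls = [(100.0, 1), (100.0, 100)]

print(f"mean loss per car: ${mean_loss_per_car(recalls):.2f}")
print(f"mean per mean:     ${mean_per_mean(recalls):.2f}")
```

The first measure is dominated by the tiny recall, while the second weights each recall by its size, which is why it is the more stable summary across large samples.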

How much of the $200 figure is attributable to direct costs and how 
much to lost goodwill? The skimpy data suggest that, as with drugs, 
the latter dominates. If there were no indirect costs, the $200 figure 
would imply pretax direct costs of around $400 per car (since direct 
costs are a deductible expense). This would be in the high end of the 
range of per car costs that have appeared in press reports about 

12 The regression coefficients were usually within one standard error of zero and 
often of the wrong sign. We tried some crude adjustment for differences in response 
rates (car owners often do not respond to recall notices) and repair cost across a 
subsample of recalls. Specifically, we allowed the regression coefficient to depend on 
the response rate and a dummy equal to one if the recall order mandated a repair 
procedure for all recalled cars (and equal to zero if repair was required only if inspection 
revealed a defect). Neither variable sharpened our result. 

13 To illustrate the problem entailed by very small recalls, remember that losses and 
cars are essentially uncorrelated. So, suppose losses in a 1-car recall and a 100-car recall 
are each $100. The mean loss per car = ½(100/1 + 100/100) = $50.50, but the total 
loss from both recalls is only 200/101 = $1.98 per car. This last figure is our mean per mean 
for this sample. 
TABLE 7 

Estimated Dollar Losses per Recall and per Car, 1967-81 
(Constant 1981 Dollars) 

                          Loss/Recall 
                          (Million $)               Loss/Car ($) 
Sample 
(No. of Recalls)          Mean       t        Mean        t       Median     Mean/Mean 

1. All 1967-81 (116)      141.1      2.2      813.3       1.5     185.7      196.6 
   A. GM (41)             235.5      1.4      477.5        .6      46.6      189.2 
   B. Ford (44)           128.6      3.7      694.2       2.3     198.2      226.7 
   C. Chrysler (31)        34.0      1.0      1,426.4      .9      95.7      144.9 
2. All 1967-74 (53)       110.1       .9      1,092.9     1.0      64.7      179.7 
3. All 1975-81 (63)       167.2      2.7      578.1       2.0     189.0      207.4 

Note. Loss/recall is estimated by multiplying CER(-5, 5) for each recall period by the market value of the firm 
in that period. Mean is the mean of (loss/recall)/(cars involved in the recall). The last column (mean/mean) is 
obtained by dividing the mean of loss/recall, as shown in the first column, by the mean of cars/recall from table 4. 
Each loss/recall is deflated by the GNP deflator set to a base of 1981 = 1.0. 


specific recalls. 14 We know of only one publicly available piece of data 
that permits an estimate of the cost per car in a large sample of recalls: 
GM disclosed that it spent $33 million on recalls in 1982 (Detroit Free 
Press, May 22, 1983). This amounts to about $35 per GM car recalled 
that year. If this is anywhere close to being typical, then the bulk of 
the stock market loss represents indirect costs: lost sales, liability suits, 
and so on. 15, 16 

Table 7 also sheds light on the very large discrepancy between pre- 
and post-1975 CERs. The first column (lines 2 and 3) shows a much 


14 For example, a Detroit Free Press series on recalls states that a 1983 recall of 240,000 
GM cars “is thought to be the most expensive per-car recall ever.” GM's estimate of its 
total direct cost for the recall is $30 million, or $125 per car (Detroit Free Press, May 24, 
1983). 

15 In this connection, Crafton, Hoffer, and Reilly (1981) and Reilly and Hoffer 
(1983) show that sales of recalled models appear to decline when the recall is announced. 
The latter article estimates that sales of a domestic recalled “line” declined 
about 5 percent in the month of a recall announcement in the 1977-81 period, but 
there is no indication that the decline lasted more than a month. A single-month sales 
decline of this magnitude could not account for very much of the typical stock market 
loss. There are about 60 domestic “lines” with average monthly sales of around 10,000 
cars. Reilly and Hoffer exclude lines with fewer than 8,000 cars per month. If the 
average line in the sample has 20,000 monthly sales, a 5 percent decline represents 
1,000 cars, or roughly $10 million sales. The lost pretax profits on these sales would 
amount to under $2 million, based on the industry’s margin of sales over material and 
labor costs. 

16 We also found no serial correlation in recalls. For example, the correlation of the 
number of recalls in successive 3-month periods is .03 for Chrysler, .08 for Ford and 
GM, and .17 for the aggregate of all three firms. None of these is significant; auto 
recalls, like drug recalls, appear to occur randomly. 
smaller discrepancy when the loss per recall is converted to constant 
dollars, and the discrepancy in the last column (loss per car) is smaller 
still. So, most of the discrepancy in CERs is due simply to the decline 
in the real value of auto stocks in the late 1970s: it took a bigger 
percentage of a smaller market value to generate the same dollar loss 
per recall. Most of the remaining discrepancy is attributable to the 
larger size of post-1975 recalls (see table 4). 

3. How Are Competitors Affected 
by Auto Recalls? 

The result that competitors lose rather than gain during a drug recall 
holds for auto recalls as well. And, as with drugs, the spillover effects 
are substantial. The data are summarized in table 8. The upper-left 
corner shows mean CERs to equal-weighted “portfolios” of the two 
competitors during recall periods (e.g., a Chrysler-Ford portfolio
during GM recalls). On average, competitors lose about 1 percent
during a 2-week recall period, or about two-thirds as much as the
recall company loses. All of this is attributable to 1975-81 recalls,
where the competitors’ loss (-2.40 percent) virtually matches the
recall company’s loss. 17 The remainder of the table provides the company
detail behind these averages in terms of (1) all competitors’
response to a specific company’s recalls (upper right), (2) a specific
competitor’s response to both its rivals’ recalls (lower left), and (3) a
specific competitor’s response to a specific rival (lower right). With
due respect given to the large standard errors, this detail reveals a
considerable heterogeneity in the spillover. For example: (a) GM loses
more during its rivals’ recalls (1.59 percent) than it does during its
own recalls (0.97 percent); the reverse is true for both Chrysler and
Ford. (b) The most damaging recalls for competitors are GM and
Ford recalls, particularly in 1975-81 (-2.61 percent and -2.77 percent,
respectively). (c) By contrast, the relatively small Chrysler recalls
cost rivals about half (-1.25 percent) as much as GM and Ford recalls
and only about one-third what they cost Chrysler itself in this 1975-81
period. So Chrysler recalls seem to be treated mainly as idiosyncrasies
without strong implications for industry wealth.

The large size of the spillover relative to the company-specific CER 


17 This difference between subperiods is less intelligible than the similar sort of
difference we found for recall company CERs (see table 6). In that case, we saw that the
apparently weak negative CERs for 1967-74 were plausibly masking negative real
dollar losses roughly comparable to those in the later period. In table 8 we find similarly
weak but positive CERs for competitors in 1967-74. These would be consistent with
nontrivial real dollar gains to competitors, a result that would excite no surprise. But
the post-1974 data clearly describe a much different world.
raises the question of whether there is any company-specific effect of
recalls at all. Or does a recall engender only an industry-wide loss,
which is shared equally by the three firms? Table 8 gives a hint: there
is no general tendency for companies to respond identically to their
own recalls and those of competitors. A more formal answer is given
by regressing a recall company’s CER(-5, 5) on the CER to the portfolio
of its competitors during recall periods. The regression
coefficient here gives the company’s average share in any industry-wide
effects of recalls, and the intercept gives the average company-specific
component. We computed the regression for each of the
three companies and obtained a mean intercept of -1.12 percent
(t = 2.98). 18 So there is a significant company-specific component to
recall losses over and above a company’s share in the industry-wide
loss. 19
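
The decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decompose_recall_cer` is hypothetical, the data are synthetic draws generated around the reported mean intercept (-1.12) and average slope (0.69), and nothing here reproduces the authors' sample or estimation code.

```python
import numpy as np

def decompose_recall_cer(own_cer, rival_portfolio_cer):
    """OLS of a recall company's CER on its competitors' portfolio CER.

    Returns (intercept, slope). The intercept is read as the company-specific
    component; the slope is the company's share in any industry-wide effect.
    """
    x = np.asarray(rival_portfolio_cer, dtype=float)
    y = np.asarray(own_cer, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # constant plus regressor
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])

# Synthetic illustration only (assumed numbers, in percent); not the paper's data.
rng = np.random.default_rng(0)
rival = rng.normal(-1.0, 2.0, size=200)
own = -1.12 + 0.69 * rival + rng.normal(0.0, 0.5, size=200)

intercept, slope = decompose_recall_cer(own, rival)
print(f"company-specific component: {intercept:.2f}, industry share: {slope:.2f}")
```

A slope below one, as in the reported average, means the recall company absorbs less than one-for-one of any industry-wide movement, while a negative intercept is the loss that remains after that share is removed.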


V. Summary 

Drug and auto recalls have strikingly similar effects on the wealth of 
shareholders. Both are much more costly than the direct costs of 
recalling the defective product. In both types of recalls a more gen¬ 
eral loss of goodwill seems to be a large component of the total loss. 
This result is not unique to this form of regulation. One of us 
(Peltzman 1981) has found similarly large goodwill losses for FTC 
false advertising cases. Just what lies behind these goodwill losses 
remains something of a mystery, which we leave for future research.
Our attempts, mainly with drugs, to find answers in costs of product 
liability suits and in time dependence of recalls succeeded only in 
deepening the mystery. 

Another similarity between drug and auto recalls—and the source 
of another mystery—lies in the response of competitors. Their own¬ 
ers lose substantially when a rival product is recalled. Any favorable 
effects on the demand for substitutes from a recall are swamped by a 
more general negative effect on the industry. This is another piece of 


18 The average regression coefficient is 0.69 (t = 7.46). This implies that the typical
company share in an industry-wide recall loss of 1 percent is under 1 percent. The
proximate reason for this is that Chrysler has more volatile returns than the others, so
when Chrysler loses 1 percent the others lose less.

19 Since auto stocks generally had negative CERs during our sample period
(especially in 1975-81), we also have to wonder whether there really is a spillover. That
is, could the so-called spillover just be the result of other bad news? This is unlikely. For
1975-81, the average company CER(-5, 5) during all periods when it had no recalls
was -0.5 percent. But, in the subset of these periods when its competitors have recalls,
the mean CER is, as table 8 shows, over four times as large. So recalls to competitors
were clearly especially bad news.
evidence that something much more is involved in a recall than failure
of a specific product.

It is difficult to compare the magnitudes of the losses in drug and 
auto recalls, because both the frequency of recalls and the number of 
firms involved differ. Per recall, the percentage loss is much greater 
for drug recalls (6 percent vs. 1.5 percent). But auto recalls occur over 
twice as often and involve only three companies versus 19 for drugs 
(in our samples). So per company per year, auto recalls are consider¬ 
ably more costly. The average loss to rivals is roughly the same (1 
percent) for auto and drug recalls, but with about 50 rivals in the case 
of drugs versus two for autos, the drug recalls clearly have the more
substantial cross-firm effects. 

Our results help shed light on the degree to which the capital mar¬ 
ket might suboptimally deter production of faulty products. They 
show that, in the simple sense of the market’s not internalizing even 
the direct costs, suboptimal deterrence is no problem. Our results also
show that to make a suboptimal deterrence story credible requires 
very generous estimates of the indirect social costs. The only source of 
such large costs we have found is in the cross-company effects. They 
are large enough to suggest a larger scope for industry cooperation in
product design and inspection than economists have heretofore imagined.

Finally, we hope that our results have begun defining a new re¬ 
search agenda. They suggest that recall costs are like an iceberg whose 
easily visible part hides most of what is important. The challenge for 
future research is to discover just what form—for example, reduced 
sales, increased quality costs, lost “political capital”—these large, cur¬ 
rently amorphous costs take. 
Inflation and Currency Reform


Laura LaHaye 

University of Illinois at Chicago


Expectations of currency reform are explicitly incorporated into a
rational expectations version of the Cagan model of money market
equilibrium, where expectations about the timing of reform are derived
from a simple government rule for choosing when to reform
the currency. The model is estimated for three of the episodes of
hyperinflation that Cagan investigated, and the results support the
hypothesis that expectations of reform were responsible for the apparent
instability of the demand for money during the final months
of these hyperinflations.


Economists have long been aware of the effects that expectations of a 
future currency reform would have on prices and the demand for 
real cash balances. In the “Discussion on Monetary Reform" at the 
1924 meeting of the Royal Economic Society, Cannan (1924, p. 158) 
explained that “if the fear of further depreciation could by any means
be allayed, the holders of currency would try to enlarge their hold¬ 
ings, which would reduce prices if no more currency was printed, or 
absorb a large amount of new issue without any rise of prices if the 
press was allowed to go on for a time.” 

Later, when Cagan (1956) undertook the first modern empirical 
study of the demand for money during hyperinflations, he found that 
the holdings of real cash balances appeared to be excessive during the 
final few months of four of the seven episodes he investigated. Cagan 
suggested that this apparent instability of the demand for money may
have been the result of rumors of currency reform that were circulating
at the time. Cagan was unable to explicitly incorporate this hypothesis
into his model in which inflationary expectations were assumed
to be formed adaptively. He chose instead to omit the
problematic final observations when estimating the demand for
money, a practice that has been followed by subsequent investigators
of the demand for money during hyperinflation. 1

With the application of Muth’s (1961) theory of rational expectations
to models of the demand for money and price level determination,
however, it has become possible to illustrate precisely the mechanism
Cannan was describing. As Sargent and Wallace (1973b) have
shown, 2 in theory the price level or rate of inflation in any period
generally will depend on expectations about money growth throughout
the indefinite future. Consequently, if individuals suddenly come
to expect that a currency reform, which will reduce money growth
rates, 3 will occur in the future, the price level and rate of inflation will
decline immediately as individuals attempt to increase their holdings
of real cash balances.

Combining a rational expectations model of the demand for money
with a theory about the timing of monetary reform, Flood and Garber
(1980a) have computed a time series of the subjective probability of
immediate reform during the final months of the German hyperinflation.
They find that the probability of reform was generally high in
the months prior to the November 1923 German currency reform
and that, when Cagan’s measure of expected inflation is revised


1 See, e.g., Barro (1970), Khan (1975), Sargent (1977), and Salemi and Sargent
(1979). Frenkel (1977) too was required to omit the final months of the German
hyperinflation in his study, but for a different reason. He used the forward premium on
foreign exchange as a proxy for expected inflation, but the forward market was closed
during the final months of the hyperinflation. This is unfortunate because observations
on the expected depreciation of the mark would help to distinguish between an unstable
money demand function (or a nonlinear functional form, as also suggested by
Cagan [1956]) and reduced expectations of inflation relative to the predictions of
adaptive or rational expectations models that do not incorporate the effects of
anticipated reform.

2 See also Brock (1974). Empirically, Black (1972) has shown that anticipated future
disturbances had effects on international capital flows during World War II, using a
model like that developed by Muth (1961).

3 The currency reforms that are the subject of this study are to be distinguished from
those that involve simply a change of currency units, a government announcement of
good intentions, or a one-time reduction of the money supply designed to mop up
excess liquidity generally resulting from price controls (see Gurley 1956). The first two
events would probably have no effect on the demand for money, while the third would
probably reduce the demand for money by creating expectations of future capital levies
on the holdings of liquid assets.
downward to reflect the probability of reform, the holdings of real
cash balances no longer appear to be excessive in the last 3 months of 
this hyperinflation. 4 

The immediate purpose of the present paper is to establish that 
expectations of currency reform do help to explain the behavior of 
real cash balances and inflation during the final months of hyperinfla¬ 
tion in Germany and Poland following World War I and in Greece 
during World War II. In addition, by explicitly incorporating reform 
expectations in a model of the demand for money, one obtains more 
precise estimates of the key structural parameter of interest (the 
semielasticity of the demand for money with respect to the expected 
rate of inflation) than can be obtained when the observations from the 
volatile final months of these hyperinflations are omitted. More gen¬ 
erally, however, the results in this paper indicate the potential impor¬ 
tance of explicitly accounting for the effects of expectations of future 
policy changes when estimating reduced-form equations that corre¬ 
spond to a theory of economic behavior. Hence, the results support 
the essential implication of rational (as opposed to adaptive) expecta¬ 
tions emphasized by Lucas (1976), that is, that the decision rules of 
economic agents will not generally be invariant with respect to the 
policies chosen by the government. 

In Section I, a rational expectations version of Cagan’s money mar¬ 
ket equilibrium model is developed with the effects of a known future 
reform explicitly incorporated. A rule for the government’s decision 
to reform the currency is proposed in Section II, and this rule is used 
to determine agents’ expectations about the timing of reform in each 
period prior to the actual reform. The estimation procedure and 
empirical results are contained in Section III, followed by concluding
remarks. 


I. Money Market Equilibrium 
with a Known Reform 

The money demand function is assumed to have a form similar to 
that proposed by Cagan, with the exception that a time trend is in¬ 
cluded for the purpose of capturing any effects of the trend growth 
of real income on the demand for real cash balances. In addition, it is 
assumed that the disturbance to the money demand function ($u_t$)


4 See also Flood and Garber (1983) in which some of the empirical results of their
1980 study are revised and some of the analysis is extended to other episodes of
hyperinflation.
follows a random walk: 5 $u_t = u_{t-1} + \eta_t$; $E\eta_t = E\eta_t\eta_{t-j} = 0$ for all $j \neq 0$,
$E\eta_t^2 = \sigma_\eta^2$. The demand for real cash balances is given as

$$(m_t - p_t)^d = \gamma_0 + \gamma t - \alpha E_t\pi_{t+1} + u_t, \tag{1}$$

where $m$ and $p$ are the natural logarithms of the money stock and the
price level; $\gamma_0$, $\gamma$, and $\alpha$ are structural parameters ($\alpha > 0$), and
$E_t\pi_{t+1} = E_t(p_{t+1} - p_t)$ is the mathematical expectation of the rate of inflation
conditioned on information available at time $t$.

First differencing the money demand function (1) and equating
desired and actual cash balances in each period yields an equation
denoted the “portfolio balance equation”:

$$\mu_t - \pi_t = \gamma - \alpha E_t\pi_{t+1} + \alpha E_{t-1}\pi_t + \eta_t, \tag{2}$$

where $\mu_t = m_t - m_{t-1}$ is the rate of growth of the money supply and
$\pi_t = p_t - p_{t-1}$ is the current rate of inflation.

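Taking expectations as of $t - 1$ and rearranging yields a difference
equation for the “systematic” part of the rate of inflation,

$$E_{t-1}\pi_t = \frac{1}{1+\alpha}\left(E_{t-1}\mu_t - \gamma\right) + \frac{\alpha}{1+\alpha}E_{t-1}\pi_{t+1},$$

which has the following solution: 6

$$E_{t-1}\pi_t = -\gamma + \frac{1}{1+\alpha}\sum_{j=0}^{\infty}\left(\frac{\alpha}{1+\alpha}\right)^{j}E_{t-1}\mu_{t+j}. \tag{3}$$

Updating (3) to obtain

$$E_t\pi_{t+1} = -\gamma + \frac{1}{1+\alpha}\sum_{j=0}^{\infty}\left(\frac{\alpha}{1+\alpha}\right)^{j}E_t\mu_{t+j+1}, \tag{4}$$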

5 The Durbin-Watson statistics for both Cagan’s (1956) and Barro’s (1970) estimated
money demand functions are low, and Khan’s (1975) estimates of the autoregressive
parameters are close to unity. Salemi and Sargent (1979) have also assumed that money
demand disturbances follow a random walk, which is consistent with the notion that the
demand for money depends on permanent income, which is omitted from the equation,
if permanent income follows a random walk with trend.

6 In general, this solution would also include the term $c[(1 + \alpha)/\alpha]^t$, where $c$ is an
arbitrary constant (resulting in the familiar nonuniqueness of rational expectations
equilibria). Imposing the terminal condition that real cash balances are positive when
the money supply is finite implies that $c = 0$ and that therefore “speculative bubbles” do
not exist. The absence of speculative bubbles has also been assumed by Sargent (1977)
and Salemi and Sargent (1979) and has been directly tested, for the German hyperinflation
prior to the summer of 1923, by Flood and Garber (1980b). Direct tests, however,
require the alternative assumption that rational agents do not anticipate policy
changes, an assumption that contradicts the hypothesis under investigation in this
study.
substituting (3) and (4) into (2), and rearranging results in the following
solution for the rate of inflation:

$$\pi_t = -\gamma + (1 - \theta)\sum_{j=0}^{\infty}\theta^{j}E_{t-1}\mu_{t+j} + e_t - \eta_t, \tag{5}$$

where $\theta = \alpha/(1 + \alpha)$ and $e_t = \sum_{j=0}^{\infty}\theta^{j}(E_t\mu_{t+j} - E_{t-1}\mu_{t+j})$.

A necessary condition for the existence of the solution (5) is

$$\lim_{j\to\infty}\theta^{j}E_{t-1}\mu_{t+j} = 0. \tag{6}$$

In order for (6) to be satisfied for all possible paths $\{E_t\mu_{t+j}\}$, it is
necessary that $|\theta| < 1$, which is satisfied for $\alpha > 0$. For $|\theta| < 1$, a
sufficient condition for (6) to hold is that $\{E_t\mu_{t+j}\}$ is of mean exponential
order less than $1/\theta$. 7 That is, the money growth process must not
be “too” explosive throughout the indefinite future.
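
As a rough numerical illustration of the systematic part of (5) and of the convergence condition, the following Python sketch evaluates a truncated version of the discounted sum. The function name `systematic_inflation`, the parameter values, and the two money growth paths are assumptions chosen for illustration; they are not estimates or data from the paper.

```python
def systematic_inflation(alpha, gamma, expected_mu):
    """Truncated systematic part of (5):
    -gamma + (1 - theta) * sum_j theta**j * E[mu_{t+j}], with theta = alpha/(1+alpha).
    """
    theta = alpha / (1.0 + alpha)
    discounted = sum(theta**j * mu for j, mu in enumerate(expected_mu))
    return -gamma + (1.0 - theta) * discounted

alpha, gamma = 5.0, 0.02            # illustrative values only
theta = alpha / (1.0 + alpha)
horizon = 2000                      # long enough that the truncation error is negligible

# Constant expected money growth: inflation settles at mu - gamma.
pi_constant = systematic_inflation(alpha, gamma, [0.30] * horizon)

# Geometrically growing money growth with theta * growth < 1: the sum still converges.
growth = 1.1
growing_path = [0.30 * growth**j for j in range(horizon)]
pi_growing = systematic_inflation(alpha, gamma, growing_path)
closed_form = -gamma + (1.0 - theta) * 0.30 / (1.0 - theta * growth)

print(pi_constant, pi_growing, closed_form)
```

With a growth factor at or above $1/\theta$ the partial sums would no longer settle down as the horizon is extended, which is the sense in which the process would be "too" explosive.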

Flood and Garber (1980a) have characterized the probability of 
immediate reform in terms of the probability that the prereform 
process generating money growth does not satisfy (6); that is, the 
probability that, if the prereform money growth process were known 
with certainty, the price level or rate of inflation would be infinite and 
hence the currency worthless. This characterization of the probability 
of immediate reform implicitly assumes that agents always believe 
that whichever process is currently generating the rate of money 
growth will continue forever. The approach taken in this paper is to 
assume, instead, that agents may anticipate that the process generat¬ 
ing the rate of money growth will change on some future reform date. 
This implies that (6) may be satisfied even if it is known with certainty 
that the prereform process would not satisfy (6), provided that the 
reform date is finite and the postreform process does satisfy (6). De¬ 
spite the difference in approaches to dealing with the issue of antici¬ 
pations of currency reform, there are some similarities in the results, 
which will be discussed in Section III.

The standard reduced-form solution for the rate of inflation in the 
absence of reform expectations is found by specifying “the” money 
growth process, computing the implied formula for expectations of 
future money growth rates, and substituting into equation (5). To be 
specific, this standard solution is referred to as the rate of inflation
conditional on no reform ($\pi_t|NR$), which is a function of expected
future money growth rates conditional on no reform ($E_{t-1}\mu_{t+j}|NR$
and $E_t\mu_{t+j}|NR$).

7 Sargent (1979a, p. 196) and Hansen and Sargent (1980) specify this condition for
the existence of a solution for their models.

It is assumed throughout the paper that the prereform or nonreformed
money growth process is an $r$th-order autoregression including
a constant and a time trend. 8 This process is written compactly
in vector first-order form as

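$$\mu_t = g'z_t, \tag{7}$$

$$z_t = Az_{t-1} + w_t, \tag{8}$$

where $g$ is the unit vector with the first element equal to one and all
others equal to zero, $z_t$ is an $(r + 2)$ vector: $z_t' = [\mu_t\ \mu_{t-1}\ \ldots\ \mu_{t-r+1}\ 1\ t + 1]$,
$A$ is the $(r + 2)$ square matrix:

$$A = \begin{bmatrix}
a_1 & \cdots & a_{r-1} & a_r & c_\mu & \tau_\mu \\
1 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
0 & \cdots & 1 & 0 & 0 & 0 \\
0 & \cdots & 0 & 0 & 1 & 0 \\
0 & \cdots & 0 & 0 & 1 & 1
\end{bmatrix},$$

and $w_t$ is an $(r + 2)$ vector with the first element equal to the white-noise
disturbance $\varepsilon_t$ and all other elements identically zero: $\varepsilon_t = g'w_t$.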

From (7) and (8), therefore, expectations of future money growth
rates, conditional on no reform, are given as

$$\begin{aligned}
E_{t-1}\mu_{t+j}|NR &= g'A^{j+1}z_{t-1}, \\
(E_t - E_{t-1})\mu_{t+j}|NR &= g'(A^{j}z_t - A^{j+1}z_{t-1}) = g'A^{j}w_t.
\end{aligned} \tag{9}$$
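
The companion form in (7)-(9) can be checked with a short Python sketch. The helper names `companion_matrix` and `forecast_money_growth` are hypothetical, and the AR(2) coefficients, constant, and trend below are arbitrary illustrative numbers rather than estimates from the paper.

```python
import numpy as np

def companion_matrix(ar_coefs, const, trend):
    """Build the (r+2) x (r+2) matrix A of (8) for an AR(r) with constant and trend."""
    r = len(ar_coefs)
    A = np.zeros((r + 2, r + 2))
    A[0, :r] = ar_coefs          # autoregressive coefficients a_1 ... a_r
    A[0, r] = const              # c_mu multiplies the constant 1
    A[0, r + 1] = trend          # tau_mu multiplies the trend element
    for i in range(1, r):
        A[i, i - 1] = 1.0        # shift each lag down one position
    A[r, r] = 1.0                # the constant stays equal to 1
    A[r + 1, r] = 1.0            # trend element advances by 1 each period
    A[r + 1, r + 1] = 1.0
    return A

def forecast_money_growth(A, z_prev, j):
    """E_{t-1} mu_{t+j} | NR = g' A^(j+1) z_{t-1}, as in (9)."""
    g = np.zeros(A.shape[0])
    g[0] = 1.0
    return float(g @ np.linalg.matrix_power(A, j + 1) @ z_prev)

# Illustrative AR(2): z_{t-1} = [mu_{t-1}, mu_{t-2}, 1, t] with t = 10.
A = companion_matrix([0.5, 0.2], const=0.1, trend=0.01)
z_prev = np.array([1.0, 0.8, 1.0, 10.0])
print([round(forecast_money_growth(A, z_prev, j), 4) for j in range(3)])
```

The same forecasts can be obtained by iterating the scalar autoregression forward one period at a time; the matrix form simply packages that recursion so that it can be summed in closed form later.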

Finally, substituting (9) into (5) for the case in which no reform is 


8 Flood and Garber (1980a, 1980b, 1983) also model money growth as an autoregressive
process, excluding lagged rates of inflation. In contrast, Salemi and Sargent (1979)
include lagged inflation in their money growth process, which is part of a general
vector-autoregressive representation for inflation and money growth. Contrary to the
“model-free” findings of Sargent and Wallace (1973a), Salemi and Sargent (1979) do
not find that inflation is exogenous with respect to money growth. The reason for going
to the opposite extreme, excluding lagged inflation in the money growth process, is
basically that, with expectations of a future reform, a stable vector-autoregressive
representation for inflation and money growth does not exist. More to the point, however,
even if lagged rates of inflation would improve the fit of the money growth equation,
excluding lagged rates of inflation will not affect the validity of the restrictions that are
imposed to identify the parameters of the portfolio balance equation. This follows from
the “law of iterated projections.” In effect, excluding lagged inflation amounts to
excluding lagged inflation from the information set when taking expectations. This does
not alter the relationships (3) and (4), and hence (5), as long as $E_t\mu_t = \mu_t$ still applies.
See Salemi and Sargent (1979) and Sargent (1979b) for similar applications of the law of
iterated projections.
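anticipated yields the following reduced-form solution for the rate of
inflation:

$$\begin{aligned}
\pi_t|NR &= -\gamma + (1 - \theta)\sum_{j=0}^{\infty}\theta^{j}g'A^{j+1}z_{t-1} + \sum_{j=0}^{\infty}\theta^{j}g'A^{j}w_t - \eta_t \\
&= -\gamma + (1 - \theta)g'(I - \theta A)^{-1}Az_{t-1} + g'(I - \theta A)^{-1}w_t - \eta_t \\
&= f'z_{t-1} + v_t,
\end{aligned} \tag{10}$$

where $f' = [f_1\ f_2\ \ldots\ f_r\ c_\pi\ \tau_\pi]$ and $v_t$ is unanticipated inflation:
$v_t = \pi_t - E_{t-1}\pi_t = f_0\varepsilon_t - \eta_t$. The elements of $f$ are given as

$$\begin{aligned}
f_0 &= \Bigl(1 - \sum_{i=1}^{r}\theta^{i}a_i\Bigr)^{-1}, \\
f_i &= (1 - \theta)(1 + \alpha f_1)a_i + \theta f_{i+1}, \qquad i = 1, \ldots, r - 1, \\
f_r &= (1 - \theta)(1 + \alpha f_1)a_r, \\
c_\pi &= -\gamma + (1 + \alpha f_1)c_\mu + \alpha\tau_\pi, \\
\tau_\pi &= (1 + \alpha f_1)\tau_\mu.
\end{aligned} \tag{11}$$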


In the absence of reform expectations, therefore, the money market
equilibrium model would be estimated by jointly estimating the
money growth equation, $\mu_t = g'Az_{t-1} + \varepsilon_t$, from (7) and (8), and the
reduced-form inflation equation (10) subject to the restrictions (11) to
produce maximum likelihood estimates of the $(r + 2)$ nontrivial elements
of $A$ and the money demand parameters $\alpha$ and $\gamma$.
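
The cross-equation restrictions can be verified numerically. The sketch below computes the reduced-form coefficient vector from the matrix expression $(1 - \theta)g'(I - \theta A)^{-1}A$, with $-\gamma$ added to the constant, and compares it with the element-by-element restrictions. The helper names and the parameter values are assumptions made for illustration; this is not the estimation procedure itself.

```python
import numpy as np

def companion_matrix(ar_coefs, const, trend):
    """Companion matrix for an AR(r) money growth process with constant and trend."""
    r = len(ar_coefs)
    A = np.zeros((r + 2, r + 2))
    A[0, :r] = ar_coefs
    A[0, r], A[0, r + 1] = const, trend
    for i in range(1, r):
        A[i, i - 1] = 1.0
    A[r, r] = A[r + 1, r] = A[r + 1, r + 1] = 1.0
    return A

def reduced_form_coefficients(alpha, gamma, A):
    """Return (f, f0): f' = (1-theta) g'(I - theta A)^{-1} A with -gamma in the constant,
    and f0 = g'(I - theta A)^{-1} g, the coefficient on the money growth innovation."""
    n = A.shape[0]
    theta = alpha / (1.0 + alpha)
    g = np.zeros(n)
    g[0] = 1.0
    h = np.linalg.solve((np.eye(n) - theta * A).T, g)   # h' = g'(I - theta A)^{-1}
    f = (1.0 - theta) * (h @ A)
    f[n - 2] -= gamma                                   # constant element absorbs -gamma
    return f, float(h[0])

# Illustrative parameter values only.
alpha, gamma = 4.0, 0.05
a, c_mu, tau_mu = [0.6, 0.2], 0.1, 0.02
theta = alpha / (1.0 + alpha)
f, f0 = reduced_form_coefficients(alpha, gamma, companion_matrix(a, c_mu, tau_mu))
f1, f2, c_pi, tau_pi = f
print(f0, f1, f2, c_pi, tau_pi)
```

In this sketch $1 + \alpha f_1$ coincides with $f_0$, so each restriction can be read as scaling the corresponding money growth coefficient by the response of inflation to a money growth innovation.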

When, however, agents anticipate that there will be a currency reform
on the future date $T$, they will not use the formula (9) to predict
all future money growth rates. In order to illustrate the effect of
reform expectations on inflation in periods prior to reform, the general
solution for inflation (5) is modified to indicate explicitly that
expectations of future money growth rates depend on the particular
process agents believe will be generating money growth on future
dates. In particular, assume that there are only two possible money
growth processes, a nonreformed and a reformed process, and let
$E_{t-1}\mu_{t+j}|NR$ ($E_{t-1}\mu_{t+j}|R$) denote the expectation, as of $t - 1$, of money
growth in period $t + j$, conditional on the nonreformed (reformed)
process generating the money growth rate in that period ($t + j$).
Then, if agents know that there will be a switch from the nonreformed
to the reformed process between periods $T$ and $T + 1$
(henceforth referred to as a reform in period $T$), we have

$$E_{t-1}\mu_{t+j} = \begin{cases}
E_{t-1}\mu_{t+j}|NR, & j \le T - t, \\
E_{t-1}\mu_{t+j}|R, & j > T - t.
\end{cases}$$

In this case, the systematic part of the rate of inflation in period $t$,
conditional on a reform occurring at $T$, is given as

$$E_{t-1}\pi_t|T = \begin{cases}
-\gamma + (1 - \theta)\sum_{j=0}^{T-t}\theta^{j}E_{t-1}\mu_{t+j}|NR + (1 - \theta)\sum_{j=T-t+1}^{\infty}\theta^{j}E_{t-1}\mu_{t+j}|R, & t \le T, \\
E_{t-1}\pi_t|R, & t > T.
\end{cases} \tag{12}$$

From (12) it is easy to see that even if the nonreformed process
were so explosive that the boundary condition (6) would not be
satisfied if the nonreformed process were to continue forever,
$E_{t-1}\pi_t|T$ will be finite for finite $T$ and a reformed process that is not
too explosive. If the nonreformed process is not so explosive as to
result in $E_{t-1}\pi_t|NR = \infty$, (12) can be manipulated to further illustrate
the effect of a future reform:

$$\begin{aligned}
E_{t-1}\pi_t|T &= -\gamma + (1 - \theta)\sum_{j=0}^{\infty}\theta^{j}E_{t-1}\mu_{t+j}|NR \\
&\quad + \theta^{T-t+1}(1 - \theta)\sum_{j=0}^{\infty}\theta^{j}\left(E_{t-1}\mu_{T+j+1}|R - E_{t-1}\mu_{T+j+1}|NR\right) \\
&= E_{t-1}\pi_t|NR + \theta^{T-t+1}\left(E_{t-1}\pi_{T+1}|R - E_{t-1}\pi_{T+1}|NR\right), \qquad t \le T.
\end{aligned} \tag{13}$$

In periods prior to a given reform date, (13) indicates that the
systematic part of the rate of inflation is equal to the rate that would
be observed if no reform were expected plus a fraction of what will be
referred to as the “reform effect,” that is, the difference between the
rate of inflation that is expected to exist in the period immediately
following reform and the rate that would have been expected if it
were believed that there would never be a reform. By the definition of
a reform, as opposed to a more general change of monetary regimes,
the reform effect is negative. For a given reform date, the fraction
($\theta^{T-t+1}$) increases geometrically as the reform date is approached. For
a given reform effect, therefore, the model predicts that the rate of
inflation would decline and consequently desired real cash balances
would grow, relative to the predictions of a model that failed to account
for expectations of reform, as the reform date is approached.
This prediction corresponds closely to Cagan’s (1956, p. 57) observation
that, for the German hyperinflation, “the horizontal distance
between the regression line and observations for the three preceding
[the reform] months is progressively greater for the months nearer to
the end of hyperinflation. The configuration of these observations is
thus consistent with an increasing effect on real cash balances of
expectations of currency reform.” 9
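
The step from (12) to (13) can be checked numerically with a minimal Python sketch. The two functions below evaluate each expression over a finite horizon for given expected money growth paths, where `nr_path[j]` and `r_path[j]` stand for the expectations of money growth in period $t + j$ under the nonreformed and reformed processes. The function names, the parameter values, and the paths themselves are illustrative assumptions.

```python
def inflation_eq12(theta, gamma, nr_path, r_path, periods_to_reform):
    """Systematic inflation from (12); periods_to_reform = T - t >= 0."""
    k = periods_to_reform
    pre = sum(theta**j * nr_path[j] for j in range(0, k + 1))             # j <= T - t
    post = sum(theta**j * r_path[j] for j in range(k + 1, len(r_path)))    # j > T - t
    return -gamma + (1.0 - theta) * (pre + post)

def inflation_eq13(theta, gamma, nr_path, r_path, periods_to_reform):
    """Same quantity written as in (13): no-reform inflation plus a weighted reform effect."""
    k = periods_to_reform
    no_reform = -gamma + (1.0 - theta) * sum(theta**j * m for j, m in enumerate(nr_path))
    reform_effect = (1.0 - theta) * sum(
        theta**i * (r_path[k + 1 + i] - nr_path[k + 1 + i])
        for i in range(len(nr_path) - k - 1)
    )
    return no_reform + theta ** (k + 1) * reform_effect

theta, gamma, horizon = 0.8, 0.02, 400                   # illustrative values
nr_path = [0.5 * 1.05**j for j in range(horizon)]        # rising money growth without reform
r_path = [0.01] * horizon                                # low money growth after reform

for k in (5, 3, 1, 0):
    print(k, round(inflation_eq12(theta, gamma, nr_path, r_path, k), 4),
          round(inflation_eq13(theta, gamma, nr_path, r_path, k), 4))
```

In this example the two expressions agree for every distance to reform, and the gap between inflation with and without the anticipated reform widens as the reform date gets closer.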

The analysis above describes the factors determining the evolution 
of inflation for a given reform date, T. If we maintain the assumption 
that agents know, in periods prior to T, that a reform will occur on the 
fixed date T, and if we assume that the reform entails a discrete switch 
from the nonreformed money growth process characterized by the 
matrix A to the reformed process characterized by a similar matrix B, 
then the following estimation strategy is appropriate. Use the two
money growth processes to obtain formulas for $E_{t-1}\mu_{t+j}$ and $E_t\mu_{t+j}$
and substitute these formulas into (5) to obtain

$$\begin{aligned}
\mu_t &= g'z_t, \\
z_t &= Az_{t-1} + w_t, && t \le T, \\
    &= Bz_{t-1} + w_t, && t > T, \\
\pi_t &= -\gamma + (1 - \theta)g'\bigl[(I - \theta A)^{-1} + \theta^{T-t+1}(I - \theta B)^{-1}(B - A)(I - \theta A)^{-1}A^{T-t}\bigr]Az_{t-1} + (e_t - \eta_t), && t \le T, \\
      &= -\gamma + (1 - \theta)g'(I - \theta B)^{-1}Bz_{t-1} + (e_t - \eta_t), && t > T.
\end{aligned} \tag{14}$$

With data from before and after an apparent reform, the system (14) 
could be estimated, subject to the restrictions that appear explicitly in 
the inflation equation, to produce estimates of the reform date, T, the 
parameters of the nonreformed and reformed money growth processes
(the elements of the first row of $A$ and $B$), and the money
demand parameters $\alpha$ and $\gamma$.

For a number of reasons the estimation strategy above is not pur¬ 
sued in this paper. First, the estimation problem is greatly simplified if 
it is assumed that the expected reform effect is a constant, that is, 


9 Contrary to the suggestions of Cagan (1956) and Friedman (1956), the growing
reform effect is not the result of directly including longer-term expected inflation rates
in the money demand function. It reflects the fact that inflation and expected inflation
“at the moment” depend on expectations of money growth, and hence inflation, in the
more distant future.
$(E_{t-1}\pi_{T+1}|R - E_{t-1}\pi_{T+1}|NR) = E\Delta\pi$ $(t \le T)$, a constant that appears as a
free parameter in the inflation equation. It can be shown that, with
this simplification, the rate of inflation in periods prior to a known
reform is given by

$$\pi_t|T = \pi_t|NR + \theta^{T-t+1}E\Delta\pi, \qquad t \le T.$$
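
A small Python sketch of this simplified expression shows how the weight on a constant reform effect builds up as the reform date nears. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def reform_weight(alpha, T, t):
    """Weight theta**(T - t + 1) applied to the constant reform effect, for t <= T."""
    theta = alpha / (1.0 + alpha)
    return theta ** (T - t + 1)

def inflation_given_reform(pi_no_reform, alpha, T, t, reform_effect):
    """Inflation before a known reform: no-reform inflation plus the weighted reform effect."""
    return pi_no_reform + reform_weight(alpha, T, t) * reform_effect

alpha, T = 4.0, 12           # illustrative semielasticity and reform date
reform_effect = -0.50        # assumed constant (negative) reform effect

weights = [reform_weight(alpha, T, t) for t in range(8, T + 1)]
for t, w in zip(range(8, T + 1), weights):
    print(t, round(w, 4), round(inflation_given_reform(1.0, alpha, T, t, reform_effect), 4))
```

Each period closer to the reform multiplies the weight by $1/\theta$, so with a larger $\alpha$ (a $\theta$ nearer one) the reform is felt earlier and more evenly over the preceding months.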

In addition, it is useful to distinguish between the actual reform date, 
which is a characteristic of money supply behavior, and expectations 
about the timing of reform, which influence the behavior of inflation 
and may change over time. Hence, the actual reform date ( T) is first 
identified in single-equation estimates of the money supply process. 
In the following section a rule for choosing when it) reform the cur¬ 
rency is proposed. Then an “expected" reform date variable (T,) is 
generated as the date on which it is expected (at t) that the conditions 
for immediate reform will be satisfied . 10 

Finally, given the prereform sample period (determined by $T$) and
the expected reform date variable ($T_t$), the joint inflation-money
growth system is estimated for the prereform period only. The unrestricted
(by the portfolio balance equation) specification of this system is

$$\begin{aligned}
\mu_t &= g'Az_{t-1} + \varepsilon_t, \\
\pi_t &= f'z_{t-1} + [\ldots]\,E\Delta\pi + v_t,
\end{aligned} \qquad t \le T, \tag{15}$$


where the free parameters of the system include the $(r + 2)$ nontrivial
elements of $A$, the $(r + 2)$ elements of $f$ plus $\lambda$, and $E\Delta\pi$. Imposing
the restrictions (11) and $\lambda = 0$ overidentifies the money demand
parameters $\alpha$ and $\gamma$, resulting in a system with $(r + 2 + 3)$ free
parameters. Likelihood ratio statistics may then be computed for testing
the null hypothesis that the restrictions used to identify $\alpha$ and $\gamma$
are valid.


10 “Expected” appears in quotation marks because the expected reform date variable
is not really an expectation in the sense of being the mean of a probability distribution
over future reform dates.

11 The inflation equation no longer includes the term indicating that it is conditional
on the reform date because it is assumed, for the purpose of estimating the model, that
the actual rate of inflation is identical to the rate of inflation conditional on reform
occurring on the expected date $T_t$. This is a simplifying assumption only. Given the
probability of reform on future dates, one could appropriately specify the rate of
inflation as the probability-weighted average of inflation conditional on the alternative
reform dates. However, since this probability distribution will generally depend on the
parameters of the money growth process and the money demand function (see n. 14
below and Flood and Garber [1980a]), the joint estimation of the money supply and
demand parameters would become an extremely complicated problem.
II. The Determination of Expected Reform Dates

In this section, a rule for choosing when to reform the currency is 
proposed. Assuming that agents know the reform rule, it is then 
possible to determine, in periods prior to an actual reform, the date 
on which agents expect that a reform will occur. Ultimately, the rule 
takes the form of an (S, s) inventory strategy: if the money growth
rate is expected to exceed a fixed critical value ($\mu_{cr}$) in the next period,
then undertake a currency reform immediately that will reduce the
expected rate of inflation (following reform) to a given target value
($\pi^*$). In periods prior to the actual reform, the expected reform date
variable is then generated by finding the date on which the money 
growth rate is expected to first exceed the critical money growth rate. 
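
The timing rule lends itself to a short sketch. The Python function below scans a sequence of money growth forecasts made at date $t$ and returns the first date on which the forecast exceeds the critical rate. The function name `expected_reform_date`, the convention that `forecasts[k]` refers to period $t + 1 + k$, and the numbers used are assumptions for illustration; in particular, dating the expected reform at the first exceedance follows the wording above and may differ in detail from the implementation used for the estimates.

```python
def expected_reform_date(t, forecasts, mu_crit):
    """Return the first date on which forecast money growth exceeds mu_crit.

    forecasts[k] is the date-t forecast of money growth in period t + 1 + k.
    Returns None if the critical rate is never exceeded within the horizon.
    """
    for k, mu in enumerate(forecasts):
        if mu > mu_crit:
            return t + 1 + k
    return None

# Illustrative forecasts made at t = 10 for periods 11, 12, 13, 14.
forecasts_at_10 = [0.5, 0.9, 1.4, 2.0]
print(expected_reform_date(10, forecasts_at_10, mu_crit=1.0))   # first exceedance in period 13

# One period later the forecasts are revised upward, so the expected date moves closer.
forecasts_at_11 = [1.1, 1.6, 2.3]
print(expected_reform_date(11, forecasts_at_11, mu_crit=1.0))   # first exceedance in period 12
```

Because the forecasts are regenerated each period, the expected date can shift over time even though the critical rate is fixed, which is what allows expectations about the timing of reform to differ from the actual reform date.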

The scenario that generates the proposed reform timing rule is
briefly described as follows. 12 Suppose that it is costly to adjust tax
revenues and government expenditures so that, when adjustments to
tax and spending policies are not undertaken, money creation becomes
an exogenously determined residual source of revenue for the
government. Furthermore, suppose that in the absence of these adjustment
costs there exists an optimum rate of inflation ($\pi^*$) the government
would achieve by appropriately adjusting its tax and spending
policies. Finally, suppose that these adjustment costs are of the
lump-sum variety, independent of the size of tax and spending
changes to be undertaken. Then a reform is characterized as undertaking
the lump-sum adjustment costs (reform costs) in order to reduce
inflation to the optimum level ($\pi^*$).

At each point in time the government is faced with the decision
whether to reform immediately or to postpone the reform at least 1
period. If the government chooses to postpone the reform, the cost
savings would be equal to the (1-period) interest savings on the
reform costs. These may be interpreted as the constant marginal
benefits of postponing a reform. On the other hand, if the government
chooses not to reform the currency in the current period t, the
lowest possible value for the expected rate of inflation would be
$E_t\pi_{t+1}|t+1$, that is, the expected rate of inflation conditional on
reform occurring in the next period. If social costs, which the
government seeks to minimize, are a monotonically increasing function of
the (absolute value of the) deviation between expected inflation and
the optimum rate of inflation, then the marginal costs of postponing a
reform (at t) will be positively related to $E_t\pi_{t+1}|t+1$. Consequently,
there exists some critical value ($\pi_{cr}$) such that, if $E_t\pi_{t+1}|t+1 = \pi_{cr}$,
the minimum marginal costs of postponing a reform will be equal
to the marginal benefits. Hence, the rule for choosing when to reform
the currency may be stated as: if $E_t\pi_{t+1}|t+1 > \pi_{cr}$, then reform
immediately so that $E_t\pi_{t+1} = \bar{\pi}$.13

12 This is described in greater detail in LaHaye (1980).

The rule for choosing when to reform the currency may be restated
in terms of the observable 1-period-ahead expected money growth
rates (conditional on no reform) by noting that, in general,

$$E_t\pi_{t+1} = (1 - \theta)(E_t\mu_{t+1} + \alpha E_t\pi_{t+2} - \gamma),$$

and that, if a reform occurs in period t + 1, $E_t\pi_{t+2} = \bar{\pi}$. Hence,

$$E_t\pi_{t+1}|t+1 = (1 - \theta)(E_t\mu_{t+1}|NR + \alpha\bar{\pi} - \gamma),$$

and the reform timing rule becomes:

If $E_t\mu_{t+1}|NR > \mu_{cr}$, then reform immediately (i.e., choose T = t),

where $\mu_{cr} = (1 + \alpha)\pi_{cr} - \alpha\bar{\pi} + \gamma$.

Assuming that agents know the government’s timing rule, then in
any period t, prior to the actual reform date T, agents will expect that
the reform will take place on the date $T_t$ that satisfies
$E_t\mu_{T_t+1}|NR > \mu_{cr}$.14 Although this condition will not generally result in a
closed-form solution for the expected reform date series,15 solutions can be
obtained by computing (for each period t) j-step-ahead forecasts of
the money growth rate until a value is found that exceeds the critical
money growth rate.
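
The forecasting search described above can be sketched in a few lines. This is a minimal illustration, not the author's program: the function names, the pure autoregression without a time trend, and the numbers in the usage note are all assumptions made for the example. The 9-step horizon follows the text's statement that reforms expected more than 8 months ahead are treated as not expected.

```python
import math


def ar_forecasts(history, const, coefs, horizon):
    """Iterated j-step-ahead forecasts (j = 1..horizon) from an AR(r) model.

    history: observed money growth rates, most recent last (hypothetical data).
    coefs:   [a_1, ..., a_r], applied to lags 1..r.
    """
    path = list(history)
    forecasts = []
    for _ in range(horizon):
        nxt = const + sum(a * path[-lag] for lag, a in enumerate(coefs, start=1))
        path.append(nxt)
        forecasts.append(nxt)
    return forecasts


def expected_reform_date(t, history, const, coefs, mu_cr, max_horizon=9):
    """Return T_t = t + j - 1 for the first j whose forecast exceeds mu_cr.

    Returns None when no forecast within max_horizon steps exceeds mu_cr,
    i.e. when no reform is expected in the near future.
    """
    forecasts = ar_forecasts(history, const, coefs, max_horizon)
    for j, forecast in enumerate(forecasts, start=1):
        if forecast > mu_cr:
            return t + j - 1
    return None


def ar1_closed_form(t, mu_t, a, mu_cr):
    """Continuous-time analogue for a first-order autoregression."""
    return math.log(mu_cr / mu_t) / math.log(a) + t - 1
```

For a first-order process with hypothetical values a = 1.5, current money growth 0.2, and critical rate 1.0 in period t = 10, `expected_reform_date(10, [0.2], 0.0, [1.5], 1.0)` gives period 13, and rounding `ar1_closed_form(10, 0.2, 1.5, 1.0)` up to the next whole period gives the same date.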

Given a value for the critical money growth rate for each country, it
would be a simple matter to test the empirical validity of the proposed
reform rule and to compute the expected reform date series.
Alternatively, one could assume that the proposed rule is valid, find the set of
values for the critical money growth rate for which the rule would
have resulted in a reform occurring on the actual reform date, and
use one or more elements of this feasible set to compute expected
reform date series. Because of the lack of independent empirical
measures of the critical money growth rates, the latter strategy is
pursued. The procedure for determining the feasible set of values for
the critical money growth rate is described in Section III.

13 One could probably obtain a similar condition for reform by considering the
government’s revenue requirements (see discussions in Cagan [1956] and Barro [1972])
rather than a social cost minimization problem. However, it does not seem that the
steady-state revenue-maximizing rate of (expected) inflation would be the appropriate
criterion. Departing from steady-state considerations, the model would have to be
modified to avoid the possibility that money growth (and revenues) could be infinite
during the reform period when expected inflation is anchored to the reform rate $\bar{\pi}$.

14 One might be able to construct a probability distribution of reform dates by
computing the probability that money growth would exceed the critical money growth rate
on future dates. This distribution would depend on the distribution generating money
growth rates as well as the parameters (including money demand parameters) that
determine the critical money growth rate.

15 An interesting exception is when the money growth process is a simple first-order
autoregression: $E_t\mu_{t+j} = a^j\mu_t$. In this case $T_t = [\ln(\mu_{cr}/\mu_t)]/\ln(a) + t - 1$, and the
expected reform effect is a constant: $E\Delta\pi = [(1 - \theta)/(1 - \theta a)]\mu_{cr} - \bar{\pi}$.

Although the scenario described in this section is extremely sim¬ 
plistic, it does capture the dominant characteristics of hyperinflation 
and currency reform that distinguish the episodes studied in this 
paper from other (more stable) periods in monetary history. Included 
among these characteristics are the growing importance of money 
creation as a revenue source and the explosive acceleration of money 
growth during the final months of hyperinflation in addition to the 
major fiscal policy reforms that accompanied the currency reforms. 16 

III. Estimation of the Model—Empirical Results 

The first step in the estimation of the model is to determine the 
appropriate length (r) of the autoregressive process for money 
growth (where the objective is to use the lowest possible order process 
in order to economize on degrees of freedom) and the actual reform 
date (T) for each country. Next, the expected reform dates ($T_t$) are
determined. Finally, given r, T, and $T_t$ for each country, maximum
likelihood estimates of the model (15) are obtained and various tests 
are conducted. 

Reforms appear to have occurred in Germany in November 1923,
in Greece in November 1944, and in Poland in January 1924. These 
are the months in which hyperinflation ended in these countries ac¬ 
cording to Cagan’s definition and the months in which the rate of 
monetary expansion attained a maximum and was followed by a 
sharp decline. These months also correspond to the final month in 
which each of the governments financed deficits by printing money. 

Given these priors, OLS estimates of five autoregressive models of 
the money growth rate were obtained for each country with several 
sample periods ending in successive months leading up to and beyond 
the apparent reform months given above. First- through fourth- 
order autoregressions and a first-order autoregression on the first 
differences of the money growth rate were estimated, and all models 
include a constant and a time trend. Of the five models estimated for 
each sample period (for each country), the one that produced the 
lowest standard error of the equation was chosen as the appropriate 
model for that sample period. For Greece and Poland the appropriate 
model was an AR(3) for all sample periods. For Germany, an AR(4) 
produced the lowest standard error in the early sample periods (i.e., 
in sample periods ending before September 1923). Hence, although
an AR(3) produced a somewhat lower standard error for the longer
sample periods, an AR(4) was chosen as the appropriate model of the
money growth process for Germany.

16 See LaHaye (1980), Sargent (1982), and Vaubel (1983) for a further discussion of
these fiscal reforms and references to earlier literature on the subject.

The next step is to determine when reforms actually occurred in 
each country. The objective here is to establish that the month 
identified as the actual reform date for each country satisfies the 
definition of a reform as a switch to a “reformed” money growth 
process and does not merely represent the date on which the govern¬ 
ment announced that it was undertaking a reform. However, there is 
a good deal of evidence to suggest when the reforms actually oc¬ 
curred. Consequently, the search for statistically identified actual re¬ 
form dates is limited to within a few months on either side of the 
apparent reform months described above. Assuming that the re¬ 
formed money growth process has the same functional form, AR(r), 
as the nonreformed process, possible reformed processes are es¬ 
timated for several sample periods beginning in successive months 
leading up to and beyond the apparent reform month. The actual
reform date T is then identified as the final month of the non¬ 
reformed sample for which the sum of squared residuals for the full 
sample (the nonreformed sample ending in period T plus the re¬ 
formed sample beginning in period T + 1) is minimized. 
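
The break-date search in this paragraph can be illustrated as follows. The sketch is deliberately simplified and is an assumption-laden stand-in for the procedure in the text: it fits a first-order autoregression with a constant to each subsample, whereas the paper uses an AR(r) with a constant and a time trend, and the series in the usage note is synthetic.

```python
def ssr_ar1(series):
    """Sum of squared residuals from OLS of x_t on a constant and x_{t-1}."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def find_reform_date(series, candidates):
    """Pick the last pre-reform index T that minimizes the full-sample SSR.

    For each candidate T, the nonreformed process is fitted to series[:T + 1]
    and the reformed process to series[T + 1:]; the two SSRs are added.
    """
    def total_ssr(T):
        return ssr_ar1(series[:T + 1]) + ssr_ar1(series[T + 1:])

    return min(candidates, key=total_ssr)


# Synthetic example: explosive growth through index 11, then a stable process.
pre_reform = [0.1 * 1.4 ** t for t in range(12)]
post_reform = [0.02 if i % 2 == 0 else 0.03 for i in range(8)]
example_series = pre_reform + post_reform
```

With this synthetic series, `find_reform_date(example_series, range(9, 15))` selects index 11, the last observation generated by the explosive process, because any other split forces one of the two fitted processes to absorb observations from the other regime.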

The estimates of the nonreformed and reformed money growth 
processes corresponding to the identified reform date (T) are re¬ 
ported in table 1. Not surprisingly, the reform dates for Germany and 
Greece are unambiguously identified as November 1923 and Novem¬ 
ber 1944, the apparent reform dates described above. For Poland, the 
actual reform date is also identified as the apparent reform date
(January 1924); however, the sum of squared residuals increases by only
about 2 percent if February 1924 is chosen as the actual reform date 
instead of January 1924. This is probably because money growth 
declined much less rapidly in Poland than in the other two countries. 
The Polish case illustrates some of the difficulties one might en¬ 
counter in trying to identify less extreme reforms that are followed by 
transition periods in which the money supply continues to grow at 
reasonably high rates in order to accommodate the growing demand 
for real cash balances. 

The final step is to determine the feasible set of values for the
critical money growth rate for each country and then to use the upper
and lower limits of this set to compute two expected reform date
series for each country. For this purpose, 1- through 9-step-ahead forecasts of
the money growth rate for each of several months prior to and
including the actual reform month are computed and reported in table 2.17

17 For computational convenience, months in which the reform is expected to occur
more than 8 months ahead are treated as if no reform is expected, since expectations of
reform on such distant dates will have little effect on the demand for money.

Inspection of table 2 reveals that, for each country, the 1-step-ahead
forecast of the money growth rate is greatest in the actual
reform month, as predicted by the proposed reform rule. Since the
critical money growth rate that triggers a reform must be less than the
1-step-ahead forecast of the money growth rate in the actual reform
month (otherwise the reform would have been postponed, according
to the reform rule), this 1-step-ahead forecast is the upper limit of the
feasible set of values for the critical money growth rate. Similarly, the
critical money growth rate must be greater than or equal to the
1-step-ahead forecast of the money growth rate in all months prior to the
actual reform (otherwise the reform would have occurred on an
earlier date). Hence, the lower limit of the feasible set of values for the
critical money growth rate is the largest 1-step-ahead forecast in the
months prior to the actual reform. These upper and lower limits are
indicated in table 2. For any particular value of the critical money
growth rate ($\mu_{cr}$), the expected reform date series ($T_t$) is then
determined by finding, in each period t, the j-step-ahead forecast ($E_t\mu_{t+j}$)
that first exceeds (or equals for the upper limit) the critical money
growth rate such that $T_t = \min(t + j - 1)$ for $E_t\mu_{t+j} > (\geq)\ \mu_{cr}$. Two
time series of the expected number of months until reform ($T_t - t$),
corresponding to the upper and lower limits of the feasible set of
values for the critical money growth rate, are also reported in table 2.
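
The bounds on the critical money growth rate follow mechanically from the 1-step-ahead forecasts. The sketch below shows the calculation under stated assumptions: the function name is hypothetical and the forecast values in the usage note are invented, not taken from table 2.

```python
def feasible_critical_range(one_step_forecasts, reform_index):
    """Return (lower, upper) limits for the critical money growth rate.

    one_step_forecasts[i] is the 1-step-ahead money growth forecast made in
    month i (conditional on no reform); reform_index is the actual reform month.
    upper: the forecast in the reform month, which must exceed the critical rate.
    lower: the largest forecast in earlier months, none of which triggered reform.
    """
    upper = one_step_forecasts[reform_index]
    lower = max(one_step_forecasts[:reform_index])
    if lower >= upper:
        raise ValueError("no critical rate is consistent with the reform date")
    return lower, upper
```

For invented forecasts `[0.5, 0.9, 0.7, 1.8]` with reform in the last month, the feasible range is `(0.9, 1.8)`. An empty range signals that the proposed rule could not have produced a reform on the observed date.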

For Germany, it is interesting to compare the time series of the
expected number of months until reform ($T_t - t$) with the findings of
Flood and Garber (1980a) about the probability of reform during the
months prior to currency reform. Despite a variety of differences
between the two models, their pattern of probabilities of no reform
during the last 3 months of the German hyperinflation is similar to
the pattern of the expected number of months until reform displayed
in table 2.18 In particular, their probability of no reform increases
substantially in October and then declines sharply in November, as 
does the expected length of time until reform in this paper. This 
similarity reflects the fact that an unexpectedly low rate of money 
growth in October caused agents to revise downward the probability 
that the nonreformed money growth process was explosive in the 
Flood and Garber model while increasing the expected length of time 
needed for the money growth rate to exceed the critical value in the 
model presented in this paper. 

Finally, having determined r, T, and the series $T_t$ for each country,
we are ready to estimate the inflation-money growth system (15).
Maximum likelihood estimates of the parameters of the restricted
model are reported in table 3 in addition to the marginal significance
levels of the likelihood ratio tests of the validity of the restrictions.19
All of the estimates of the money demand parameter, $\alpha$, are
statistically significant and of the correct sign, as are the estimates of the
reform effect, $E\Delta\pi$. In addition, the restrictions of the model fail to
be rejected at the 5 percent level of significance in all cases. Hence, the
results appear to provide support for the model of portfolio
equilibrium with anticipations of currency reform during the final months of
hyperinflation.

18 The similarity is even greater with their revised series reported in Flood and
Garber (1983) for which the probability of no reform is essentially unity prior to
mid-August 1923. Comparisons for Greece and Poland also reveal a number of
similarities.

For the purpose of comparison, the model was reestimated with the 
reform effect constrained to equal zero (i.e., under the hypothesis 
that no reform was anticipated during the final months of these
hyperinflations). The results are reported in table 4. In addition to the
fact that the hypothesis that $E\Delta\pi = 0$ is easily rejected at standard
confidence levels, the following observations are also of interest. The
estimates of $\alpha$ are not significantly different from zero for Germany
and Poland, and the restrictions imposed in order to identify $\alpha$ are
rejected for Germany and Greece. Consequently, the data do not tend 
to support the model of portfolio equilibrium when expectations of 
currency reform are not explicitly included in the model. 

Finally, we turn to the issue of whether or not the inclusion of 
reform expectations in the model has sufficiently eliminated the insta¬ 
bility that prompted Cagan and others to omit the final observations 
in their estimation of the demand for money during hyperinflation. 
For this purpose, the model was reestimated for the “stable” period
prior to the onset of expectations of reform. As the results in table 4
indicate, the restrictions of the model fail to be rejected in the early,
stable period; however, they also fail to identify $\alpha$ precisely.

In order to test for stability of the money demand parameter $\alpha$
alone, the model (15) is respecified to include two $\alpha$ parameters, one
for the early, stable period and one for the final months. Perhaps
surprisingly, after inspection of the results in tables 3 and 4, the
estimates of the early period $\alpha$’s that are obtained are virtually
identical to those reported in table 3 (and to the later period $\alpha$). The
likelihood ratio test strongly fails to reject the hypothesis that $\alpha$ is the same
in both periods for all three countries. The marginal significance
levels of this test are .81 for Germany, .80 for Greece, and .57 for
Poland. To some extent, this finding probably reflects the fact that the
money growth process is assumed to be stable and that therefore the
full-period money growth parameters appear in the restrictions used
to identify the early period $\alpha$.

19 For Germany, the two expected reform date series are identical, and for Poland
the series are so similar that the estimated model is the same using either series. For
Greece, the estimated models are similar using the two expected reform date series;
however, the upper limit series $(T_t - t)_u$ results in a marginally higher value of the
likelihood function than when $(T_t - t)_l$ is used (148.6 vs. 148.0), and these estimates
are reported in table 3.

TABLE 3

Inflation and Money Growth with Expectations of Reform

$\mu_t = g'z_{t-1} + u_t$;  $\pi_t = f'z_t + \theta^{T_t - t + 1} E\Delta\pi + v_t$

                    Germany            Greece             Poland
                    (June 1920-        (December 1941-    (February 1921-
                    November 1923)     November 1944)     January 1924)

$c_\mu$             -.07               -.08               -.02
                    (.05)              (.03)              (.01)
$\tau_\mu$          .006               -.001              .002
                    (.003)             (.002)             (.001)
$a_1$               .88                2.57               1.60
                    (.16)              (.13)              (.15)
$a_2$               1.46               -2.74              -.96
                    (.24)              (.33)              (.26)
$a_3$               -.97               1.77               .39
                    (.54)              (.24)              (.16)
$a_4$               -.38               ...                ...
                    (.43)
$\tau_\pi$          .01                .04                -.01
                    (.02)              (.02)              (.03)
$\gamma$            .03                -.04               .03
$\alpha$            .84                .58                2.66
                    (.14)              (.09)              (1.23)
$E\Delta\pi$        -8.71              -4.90              -.59
                    (1.58)             (1.11)             (.27)
$\sigma_u$          .256               .080               .045
$\sigma_v$          .299               .215               .214
$\sigma_{uv}$       .058               .006               .006
L                   126.0              148.6              175.6
$\lambda$           5.60               1.48               4.20
Marginal
significance        .347               .830               .380

Note.-The results were obtained using Wymer’s (1977) RESIMUL program. $\tau_\pi$ was left
unconstrained and the restrictions (11) were used to compute $\gamma$. L is the maximized value
of the log likelihood function (excluding constants), $\lambda$ is the likelihood ratio statistic
appropriate for testing the restrictions, and marginal significance is the marginal
significance level of the likelihood ratio test. Standard errors are in parentheses.

Since there are not sufficient degrees of freedom to allow for
instability of the money growth parameters while testing for the stability of
$\alpha$, a general test of structural stability is conducted instead. Consider
estimating the model for the full period, with no reform effect, with
dummy variables included in both the inflation and money growth
equations for each of the months after the early, stable period (i.e.,
dummy out the final observations). The value of the maximized log
likelihood of this “unrestricted” system can be computed directly
from the value of the maximized log likelihood for the early period.20
These values are 167.5 for Germany, 177.9 for Greece, and 179.2 for
Poland. The anticipated reform model (15) may then be interpreted
as a restricted version of this system for which k restrictions (that the k
coefficients on the dummy variables equal zero) are imposed in order
to identify the one reform effect ($E\Delta\pi$). The likelihood ratio statistics
appropriate for testing the validity of the k - 1 overidentifying
restrictions are, therefore, 83.0 for Germany (k = 6), 58.6 for Greece
(k = 8), and 7.2 for Poland (k = 6), with corresponding marginal
significance levels of zero for Germany and Greece and .21 for
Poland. Hence, the results of this test soundly reject the hypothesis of
structural stability for Germany and Greece, although not for Poland.
The strong rejections for Germany and Greece appear to reflect
primarily instability of the money growth processes as opposed to the
money demand functions. This interpretation is suggested by the
apparent instability of the parameters of the money growth processes
and the increase of the standard error of the money growth equations
($\sigma_u$) as the sample is extended from the early period to the full period,
while the standard error of the money demand (or portfolio balance)
equation ($\sigma_v$) does not increase as the sample is extended to include
the full period with anticipations of reform (compare tables 3 and 4).
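
The likelihood ratio statistics quoted above can be reproduced from the reported log likelihoods. The sketch below is an illustration, not the author's code: the statistic is twice the gap between the unrestricted (dummy variable) log likelihood and the restricted log likelihood from table 3, compared with a chi-square distribution with k - 1 degrees of freedom. The helper `chi2_sf` is a hypothetical name and uses the standard finite-sum expressions for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    h = x / 2.0
    if df % 2 == 0:
        # exp(-h) * sum_{i < df/2} h^i / i!
        term, total = 1.0, 0.0
        for i in range(df // 2):
            total += term
            term *= h / (i + 1)
        return math.exp(-h) * total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_i h^(i - 1/2) / Gamma(i + 1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += h ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(h)) + math.exp(-h) * total


# (unrestricted log likelihood, restricted log likelihood, k) as reported.
reported = {
    "Germany": (167.5, 126.0, 6),
    "Greece": (177.9, 148.6, 8),
    "Poland": (179.2, 175.6, 6),
}

results = {}
for country, (l_unrestricted, l_restricted, k) in reported.items():
    lr_stat = round(2.0 * (l_unrestricted - l_restricted), 1)
    results[country] = (lr_stat, chi2_sf(lr_stat, k - 1))
```

The statistics come out as 83.0, 58.6, and 7.2, and the tail probability for Poland is roughly .21, matching the text; the probabilities for Germany and Greece are negligibly small.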


IV. Concluding Remarks 

In summary, the results presented in this paper suggest that currency
reforms were anticipated during the final months of three
hyperinflations and that anticipations of reform exerted a significant negative
influence on inflation during these final months. Hence, Cagan’s
understanding of price level determination, which can be described
formally by a rational expectations equilibrium model of the money
market, is supported, as is Cagan’s suggestion that expectations of
reform were responsible for apparently excessive holdings of real
cash balances during the final months of these hyperinflations. In
addition, we find that, by including reform expectations in the model
in a fairly simple manner, we are able to convert a model that fails to
be supported by the data into one that is, and we are therefore able to
use data from apparently unusual periods of time to estimate
structural parameters of interest.

20 By dummying out the final k/2 observations, the parameter estimates, estimated
residuals, and sums of squares will be identical for the full period (which includes $N^*$
observations) to the values obtained for the early period alone, which includes
$N = N^* - (k/2)$ observations. Letting $\Sigma$ denote the matrix of sums of squares, L denote the
maximized value of the likelihood function (excluding constants) for the early period,
and $L^*$ denote the maximized value of the likelihood function for the full period with
the dummies, we have $L = -(N/2)\ln[(1/N^2)|\Sigma|]$ and $L^* = -(N^*/2)\ln[(1/N^{*2})|\Sigma|]$, which
implies $L^* = N^*\ln(N^*/N) + (N^*/N)L$.

It is important to bear in mind, however, that several simplifying 
assumptions have been made for the purpose of estimating the 
model. First, the model specifies the relationship between the rate of
inflation, conditional on a reform occurring on the expected date $T_t$,
and the regressors, whereas the actual rate of inflation is used in the
estimation. Second, the expected effect of the reform has been
assumed to be constant. Finally, the estimation of the joint
inflation-money growth rate system assumes that the money growth process is
stable throughout the entire prereform period, whereas the evidence 
indicates that this is not the case, at least for Germany and Greece. 

Although it is highly unlikely that these possible specification errors 
have produced results that incorrectly support the expected reform 
hypothesis, they may have significant implications for the consistency 
of the point estimates of the money demand parameter $\alpha$. In
particular, the assumption that agents used the fairly explosive parameters of
the money growth processes estimated for the full period to forecast
money growth rates during the less explosive, early stages of the
hyperinflations in Germany and Greece may lead to a downward bias
in the estimates of $\alpha$. With the weekly data that are available for the
German episode, it should be possible to account properly for the
monetary instability that has largely been ignored in this paper but
that appears to be closely related to inducing reform and expectations
that a reform will occur. In addition, by concentrating on the popular
German hyperinflation, it may be possible to include in the analysis
the relationship between fundamental tax and spending changes and
the actual and expected success of a reform, as well as the relationship 
between spending and money creation during the hyperinflation. 


Data Appendix 

Germany 

For January 1920 through December 1923, money (M) is the quantity of
Reichsbank notes plus other currencies (including Rentenmarks after
November 15, 1923) circulating in the middle of the month. Emergency note
issues are not included. Flood and Garber (1980b, app. B) have constructed
these data for the period ending December 1922 by combining weekly
Reichsbank note data (reported in the Economist) with interpolations to
mid-month of other currencies. For 1923, M is the mid-month gold value of the
money stock, reported in Statistisches Reichsamt (1925b, p. 47), converted to
nominal values by multiplying by the appropriate daily paper mark price of
gold. For 1924-26, end-of-month money stock figures, reported in various
issues of the Statistisches Jahrbuch für das Deutsche Reich, are log-linearly
interpolated to mid-month. The price level (P) is the monthly average wholesale
price index, reported in Statistisches Reichsamt (1925b, p. 17).


Greece 

M and P are reported in Cleveland and Delivanis (1949, pp. 178-95). The 
end-of-month quantity of Bank of Greece notes in circulation figures has 
been log-linearly interpolated to mid-month for comparability with the price
level series, which appears to be a monthly average index prior to November 
1944. P is an index of the cost of food in Athens. The November 1944 
observations for M and P are reported for the tenth of the month; the rates of 
money growth and inflation for the period between the middle of October 
and November 10 have been adjusted to a 31-day basis.


Poland 

Notes in circulation at the end of the month are reported in Young (1925,
2:347-48, 353) for the period ending in March 1925, and in the International
Abstract of Economic Statistics (1934, pp. 168-69) for the later period. These
data have been log-linearly interpolated to the middle of the month for
comparability with the monthly average wholesale price index, reported in Young
(1925, 2:349, 352).

The Effect of Cohort Size on Earnings
Growth: A Reexamination of the Evidence


Mark C. Berger 

University of Kentucky 


This study reexamines the effect of cohort size on earnings growth.
Finis Welch finds that individuals in larger cohorts experience
depressed earnings conditions on entry into the labor market but that
their earnings grow at faster rates than in smaller cohorts. Thus, a
large portion of the cohort size effect on earnings dissipates after a
few years. However, using data almost identical to those of Welch,
but estimating less restrictive models, it is found here that cohort size
not only depresses earnings at entry but also seems to slow down
early career earnings growth. The evidence suggests that earnings in
larger cohorts do not approach “normal” levels after a brief period
in the labor force. Rather, the negative cohort size effect on earnings
appears to worsen with experience.


I. Introduction 

During the 1970s the peak baby-boom birth cohorts entered the labor
market. Evidence from several studies clearly indicates that most of
these new workers were confronted with the prospect of depressed
earnings when they entered the labor market (Anderson 1978;
Freeman 1979; Welch 1979; Easterlin 1980; Grant and Hamermesh
1981; Berger 1983; Stapleton and Young 1984). But will they also
face depressed conditions over time as they progress through their
careers? This depends on the effect of cohort size on earnings
growth, an issue first examined by Finis Welch (1979) in an article in
this Journal. He finds that the depressant effects of cohort size on
earnings decline rapidly early in a worker’s career, although they
never disappear entirely. In other words, early career earnings grow
at faster rates in larger cohorts than in smaller ones.

While Welch is undoubtedly correct that larger cohorts face
depressed earnings conditions at entry, it is somewhat less likely that
they experience faster early career earnings growth. This latter
finding is obtained from fairly restrictive analytical and empirical
models. Welch’s results also depend on arbitrary identifying
assumptions about experience, period (sample year), and cohort (entry year)
effects on earnings. Since experience plus entry year equals sample
year, these effects can never be jointly identified, even with pooled
time-series cross-section data.1 Welch “solves” the identification
problem by restricting cohort effects on earnings to appear as cohort size
effects, thus breaking up the exact linear relationship among the
experience, period, and cohort variables. This amounts to replacing
entry year in the earnings equation with cohort size, which is
constructed so that it varies over the career for each entry cohort.
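
The identification problem can be made concrete with a small numerical check. The sketch below is illustrative only: it builds a hypothetical design matrix with a constant, experience, entry year, and sample year for 44 experience levels and 9 sample years (the coding of experience as 1 through 44 is an assumption), then computes its rank exactly. Because sample year equals experience plus entry year in every row, the four columns span only three dimensions.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank by Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # this column is a combination of earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][col] != 0:
                factor = m[i][col] / m[rank][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


# Columns: constant, experience, entry year, sample year (hypothetical coding).
design = [
    [1, experience, sample_year - experience, sample_year]
    for sample_year in range(1967, 1976)
    for experience in range(1, 45)
]

full_rank = matrix_rank(design)
# Dropping entry year removes the exact linear dependence.
reduced_rank = matrix_rank([[r[0], r[1], r[3]] for r in design])
```

Here `full_rank` is 3 even though the matrix has 4 columns, so separate linear experience, period, and cohort effects cannot all be estimated. Replacing entry year with a variable that is not an exact linear function of the other two, as Welch does with cohort size, restores full column rank.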

This study reexamines the effects of cohort size on earnings 
growth. The restrictions inherent in Welch's theoretical and empirical 
models are discussed first. Retaining his basic identification assump¬ 
tions with regard to experience, period, and cohort effects, Welch’s 
regression equations are reestimated and updated, and alternative 
estimates based on less restrictive models are presented. Contrary to 
Welch, the estimates obtained here indicate that early career earnings 
grow more slowly in larger cohorts. Thus, adverse cohort size effects 
on earnings do not diminish rapidly as Welch suggests and may actu¬ 
ally increase throughout the careers of individuals in large cohorts. In 
other words, the effects of large swings in the demographic
composition of the labor force on the structure of earnings do not appear to
be temporary in nature.


II. Cohort Size and Earnings Profiles 

Welch arrives at conclusions concerning the effects of cohort size on
earnings profiles by estimating a model in which the worker passes
through a series of phases over the career. In applying this model to
the data, Welch assumes that the career is made up of two phases:
learner and worker. In the early stages of the career, a worker’s time
is divided between working and learning activities. The proportion of
time spent in learning activities decreases linearly until the individual
becomes a fully vested worker. The depressant effects of cohort size
diminish as workers spend less and less time in learning activities.
When the transition to a fully vested worker is complete, the cohort
size wage depression remains constant throughout the remainder of
the career.

1 Heckman and Robb (1985) provide a complete treatment of this issue. Note that the
identification problem here is in terms of experience, period, and cohort effects instead
of the more common age, period, and cohort effects. The difference is due to defining
cohort in terms of entry year instead of birth year.

One key assumption that Welch makes is that the speed of transi¬ 
tion from learner to fully vested worker is exogenous and thus inde¬ 
pendent of cohort size. In other words, large and small cohorts pro¬ 
ceed through the career at the same speed. However, one of the 
effects of increases in cohort size may be to delay the transition to 
worker status, possibly due to congestion and lower-quality (or 
higher-cost) learning activities. Larger cohorts would then be ob¬ 
served to have flatter early career earnings profiles and slower rates of 
earnings growth than smaller cohorts. Therefore, in a more relaxed 
version of Welch’s career phase framework, the cohort size-earnings 
growth relationship is not necessarily positive and in fact boils down 
to an empirical question. 

Welch (1979, p. S81) also argues that his result has intuitive appeal
from the standpoint of optimal investment in human capital. When
large cohorts enter the labor market, they face depressed earnings
levels and low opportunity costs of investment activities. Therefore 
they undertake larger amounts of investment than do individuals in 
small cohorts and experience faster rates of earnings growth. But in 
addition to ignoring the effects of cohort size on the returns to human 
capital investments, this reasoning assumes that investment costs de¬ 
crease as cohort size increases. It is possible for cohort size to have a 
positive or negative effect on both the costs and the returns to investments
in human capital. 2 Workers in large cohorts may in fact choose
flatter earnings profiles, depending on the relative magnitudes of
cohort size effects on the returns and costs of investments in training. 
Thus, contrary to Welch’s assertion, the effect of cohort size on earn¬ 
ings growth is actually indeterminate from an investment in human 
capital perspective. 

III. Empirical Models 

In order to investigate cohort size effects on earnings profiles, Welch 
(1979) estimates annual and weekly log earnings equations for white 


2 Berger (1984, pp. 583-85) examines this question in more detail. For example,
increases in cohort size may result in increases instead of decreases in the costs of
human capital investments since the opportunity cost of those providing the training
may have risen.

males with March Current Population Survey (CPS) data covering the
period from 1967 to 1975. The sample is segmented into four school¬ 
ing completion groups (8-11, 12, 13-15, and 16+ years), and within 
each group the data are aggregated into 396 experience (44)—sample 
year (9) cells. The estimation takes place using these aggregated data 
with each observation weighted by the number of earners in the cell. 
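
The weighting step described above can be sketched in a few lines of Python. This is a minimal illustration and not Welch's actual procedure: the helper name `weighted_least_squares`, the toy cell values, and the cell counts are all assumptions made for the example. It uses the standard device of scaling each row by the square root of its weight before running ordinary least squares.

```python
import numpy as np

def weighted_least_squares(y, X, weights):
    """Minimise sum_k weights[k] * (y[k] - X[k] @ beta) ** 2 (hypothetical helper)."""
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(weight) turns the weighted problem into plain OLS.
    beta, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return beta

# Toy cells (made-up numbers): mean log earnings by experience cell,
# weighted by the number of earners observed in each cell.
exper = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
log_earn = 5.0 + 0.1 * exper            # exactly linear, so the fit is exact
n_earners = np.array([120, 300, 80, 45, 210, 10])

X = np.column_stack([np.ones_like(exper), exper])
beta_hat = weighted_least_squares(log_earn, X, n_earners)
print(beta_hat)
```

With real cell data the weights matter: cells with few earners have noisier mean earnings, and the weighting keeps them from dominating the fit.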

In each earnings equation, Welch allows for separate cohort size, 
experience, and period effects while also controlling for other factors. 
Cohort size (COHORT) is expressed within each schooling group as 
the log of a moving average of individuals in adjacent experience 
groups relative to the total number of individuals with that level of 
schooling. 3 Experience (EXPER) is measured using the “full density”
approach developed by Welch and Gould (1976). The earnings effects
of cohort size and experience are allowed to vary over the career
by including EXPER², an early career spline variable (SPLINE), and a
cohort size-spline interaction variable (COHORT*SPLINE). Period
effects are captured by the aggregate unemployment rate for white
males (UNEMP), a time trend (YEAR), and the proportion of each
experience—sample year cell not working (NOWORK). NOWORK 
also controls for selectivity bias, as does a variable measuring the 
proportion of each cell having its income imputed by the Census 
Bureau (INCIMP). The proportion of the cell working part time 
(PARTTIME) is included in one of the weekly earnings equations as a 
control for hours worked. 

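Welch's weekly earnings specification (including PARTTIME) can
be written

\[
\begin{aligned}
\ln(\text{WKEARN}) = {} & b_0 + b_1\,\text{EXPER} + b_2\,\text{EXPER}^2 + b_3\,\text{SPLINE} \\
& + b_4\,\text{COHORT} + b_5\,\text{COHORT*SPLINE} \\
& + b_6\,\text{UNEMP} + b_7\,\text{YEAR} + b_8\,\text{NOWORK} \\
& + b_9\,\text{INCIMP} + b_{10}\,\text{PARTTIME} + e,
\end{aligned}
\tag{1}
\]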

where the error term e is assumed to have classical properties, ignor¬ 
ing the possibility of serial correlation over time or contemporaneous 
correlation across schooling groups. Welch's annual earnings 


3 If $N_{tj}$ is the number of individuals in schooling group j with t years of experience,
Welch's cohort size measure can be expressed as

[Equation illegible in the source. As described in the text, COHORT is the log of a moving average of $N_{tj}$ over adjacent experience groups, taken relative to the total number of individuals in schooling group j.]

The weights are scaled up to sum to one when experience equals one or two.

(YREARN) and other weekly earnings equations can be expressed
similarly.

A major problem with equation (1) is that it allows for only a fairly 
restrictive cohort size-experience interaction. Note that the spline 
variable is defined as 

\[
\text{SPLINE} =
\begin{cases}
1 - \dfrac{\text{EXPER}}{\text{EXPER}'}, & \text{EXPER} \le \text{EXPER}', \\[1ex]
0, & \text{EXPER} > \text{EXPER}',
\end{cases}
\tag{2}
\]

where EXPER' is assumed to be exogenous and is given a value that 
ranges from 6 for high school dropouts to 9 for college graduates. 
The spline variable can be equivalently expressed as 

\[
\text{SPLINE} = \left(1 - \frac{\text{EXPER}}{\text{EXPER}'}\right) D, \tag{3}
\]

where D = 1 if EXPER ≤ EXPER' and D = 0 otherwise. Substituting
(3) into (1), the earnings equation becomes

\[
\begin{aligned}
\ln(\text{WKEARN}) = {} & b_0 + b_1\,\text{EXPER} + b_2\,\text{EXPER}^2 + b_3 D - \frac{b_3}{\text{EXPER}'}\,\text{EXPER*}D \\
& + b_4\,\text{COHORT} + b_5\,\text{COHORT*}D - \frac{b_5}{\text{EXPER}'}\,\text{EXPER*COHORT*}D \\
& + b_6\,\text{UNEMP} + b_7\,\text{YEAR} + b_8\,\text{NOWORK} \\
& + b_9\,\text{INCIMP} + b_{10}\,\text{PARTTIME} + e.
\end{aligned}
\tag{4}
\]

Several restrictions on the estimated parameters now become obvious.
The coefficients of EXPER*D and EXPER*COHORT*D are restricted
to be constant multiples of those of D and COHORT*D. Also,
the coefficient of EXPER*COHORT is restricted to equal zero for
workers with EXPER > EXPER'. Thus, the effects of cohort size are
allowed to vary during a worker's early years but are then forced to
remain constant throughout the rest of the career.

Perhaps just as troubling is the fact that the remaining variables are
not permitted to have different impacts on the earnings of workers
with experience less than and greater than EXPER'. For example,
business-cycle influences are not likely to be equal for younger
(EXPER ≤ EXPER') and older (EXPER > EXPER') workers. Also, since
earnings levels differ by experience, it is likely that a given change in
the proportion of the cell working part time has a different effect
across experience groups.

While the restrictions imposed by Welch may have some intuitive
appeal, it is another question whether these restrictions are supported
by the data. A better approach might be to estimate earnings equations
with and without the Welch restrictions and then test for the
validity of the restrictions using standard statistical methods.
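
The spline definitions in equations (2) and (3), and the coefficient restriction they imply in equation (4), can be checked numerically. The sketch below is only an illustration: the function names and the coefficient value `b3` are hypothetical, and the experience grid simply mirrors the 34 cells and the EXPER' = 9 value that the text reports for college graduates.

```python
import numpy as np

def spline_piecewise(exper, exper_prime):
    """Equation (2): 1 - EXPER/EXPER' up to EXPER', zero afterwards."""
    exper = np.asarray(exper, dtype=float)
    return np.where(exper <= exper_prime, 1.0 - exper / exper_prime, 0.0)

def spline_with_dummy(exper, exper_prime):
    """Equation (3): (1 - EXPER/EXPER') * D, with D = 1 if EXPER <= EXPER'."""
    exper = np.asarray(exper, dtype=float)
    d = (exper <= exper_prime).astype(float)
    return (1.0 - exper / exper_prime) * d

exper = np.arange(1, 35)   # 34 experience cells
exper_prime = 9.0          # value reported in the text for college graduates

s2 = spline_piecewise(exper, exper_prime)
s3 = spline_with_dummy(exper, exper_prime)
forms_agree = np.allclose(s2, s3)

# Restriction in equation (4): b3*SPLINE = b3*D - (b3/EXPER')*EXPER*D,
# so the EXPER*D coefficient is a fixed multiple of the D coefficient.
b3 = 0.25                  # hypothetical coefficient, for illustration only
d = (exper <= exper_prime).astype(float)
expanded = b3 * d - (b3 / exper_prime) * exper * d
restriction_holds = np.allclose(b3 * s2, expanded)
print(forms_agree, restriction_holds)
```

The same expansion applied to the COHORT*SPLINE term gives the second restricted pair of coefficients in equation (4).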

A model that relaxes Welch's specification restrictions, but maintains
his basic identification assumptions about experience, period,
and cohort effects, is obtained by interacting D with all of the included
variables, adding a COHORT*EXPER variable, and dropping the
cross-variable parameter restrictions. This is equivalent to estimating
separate equations for groups of workers with less than or greater
than EXPER' of the form

\[
\begin{aligned}
\ln(\text{WKEARN}_i) = {} & a_{0i} + a_{1i}\,\text{EXPER}_i + a_{2i}\,\text{EXPER}_i^2 + a_{3i}\,\text{COHORT}_i \\
& + a_{4i}\,\text{COHORT*EXPER}_i + a_{5i}\,\text{UNEMP} \\
& + a_{6i}\,\text{YEAR} + a_{7i}\,\text{NOWORK} \\
& + a_{8i}\,\text{INCIMP} + a_{9i}\,\text{PARTTIME} + u_i,
\end{aligned}
\tag{5}
\]

where i = 1, 2 for EXPER ≤ EXPER' and EXPER > EXPER', respectively. 4

The strategy of estimating the equations given by (5) and testing the 
restrictions implied by the Welch model is followed here. If the re¬ 
strictions inherent in (1) are rejected, then the analysis of cohort size 
effects must take place within the separate subsamples of younger 
and older workers. In other words, a rejection of the restrictions in (1) 
in favor of equation (5) would imply different structures of earnings
determination for workers with experience less than and greater than 
EXPER'. For purposes of analyzing the effects of cohort size within 
the baby-boom cohorts, this suggests a focus on the results obtained 
for the younger worker subsamples. 
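
The testing strategy just described, comparing a restricted pooled model with an unrestricted one that lets coefficients differ across the two experience groups, rests on the usual F-statistic built from restricted and unrestricted sums of squared residuals. The sketch below illustrates that calculation on synthetic data. It is not a replication: the data-generating numbers are invented, the regression is unweighted (the regressions in the text are weighted by cell frequencies), and the two-regressor specification is far simpler than equations (1) and (5).

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_statistic(ssr_r, ssr_u, n_restrictions, n_obs, n_params_u):
    """F = [(SSR_r - SSR_u) / q] / [SSR_u / (n - k_u)]."""
    return ((ssr_r - ssr_u) / n_restrictions) / (ssr_u / (n_obs - n_params_u))

# Synthetic data (made up): intercept and slope differ for EXPER <= 9.
rng = np.random.default_rng(0)
n_obs = 200
exper = rng.uniform(0.0, 20.0, n_obs)
d = (exper <= 9.0).astype(float)
log_earn = (1.0 + 0.05 * exper + d * (-0.5 + 0.03 * exper)
            + rng.normal(0.0, 0.05, n_obs))

X_restricted = np.column_stack([np.ones(n_obs), exper])
X_unrestricted = np.column_stack([np.ones(n_obs), exper, d, d * exper])

f_stat = f_statistic(ssr(log_earn, X_restricted),
                     ssr(log_earn, X_unrestricted),
                     n_restrictions=2, n_obs=n_obs, n_params_u=4)
print(f_stat)
```

A large value relative to the critical value of the F distribution with (2, 196) degrees of freedom rejects the pooled specification in this toy setting.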

The data set used here is almost identical to that originally employed
by Welch (1979). The only differences are (1) the years 1976-79
have been added to the original 1967-75 sample period, (2) there
are only 34 instead of 44 experience cells in each schooling group,
and (3) the NOWORK variable cannot be constructed. All of the


4 An even more general model would include all possible second-order terms instead
of just EXPER² and COHORT*EXPER and then interact them all with D. The resulting
problems with collinearity among the variables preclude this approach. Instead,
more limited second-order models are specified within each subsample (EXPER ≤
EXPER' and EXPER > EXPER'), with the second-order terms restricted to the key COHORT and EXPER
variables. Welch's assumptions about the properties of the error term have also been
retained. A more general formulation might incorporate serial or contemporaneous
correlation, or even separate the error term into experience, period, and cohort components
using the latent variable approach suggested by Heckman and Robb (1983).

remaining variables, most importantly cohort size and experience, are
constructed using the Welch (1979) definitions. 5 


IV. Results 

Initially, the Welch specification (eq. [1]) except for the NOWORK 
variable is estimated over his original 1967-75 sample period and 
from 1967—79 for each of the four schooling groups. Following 
Welch, the observations are weighted by cell frequencies of earners, 
and three different earnings equations are estimated: an annual earn¬ 
ings equation, and weekly earnings equations with and without a con¬ 
trol for part-time status. Table 1 presents a partial set of regression 
results for the college graduate schooling group. For each specifica¬ 
tion, column A in table 1 shows Welch’s cohort size and spline variable 
estimates, while columns B and C show the estimates obtained here. 
The estimates for the experience and other control variables obtained 
here are similar to Welch’s and are not reported since they are not 
crucial for the examination of cohort size effects. 6 

In equation (1), the entry cohort size effect is obtained by adding
the COHORT and COHORT*SPLINE coefficients ($b_4$ and $b_5$), while
the permanent effect is simply the COHORT coefficient ($b_4$). The
entry and permanent effects for college graduates estimated by
Welch and those obtained here are all negative and in most cases are
very similar in magnitude. Implications about the effect of cohort size
on early career earnings growth can be obtained from the direction of
the estimated COHORT*SPLINE coefficient. If negative, as Welch
finds for all but high school dropouts, then earnings grow at faster
rates in larger cohorts since the initial negative effect diminishes with
experience. The COHORT*SPLINE estimated coefficients are negative
in columns B and C of table 1, consistent with Welch's findings for
college graduates.
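
The entry and permanent effects described above follow directly from equation (1), where the effect of COHORT on log earnings at a given experience level is $b_4 + b_5 \cdot \text{SPLINE}$. The short sketch below traces that effect over the early career. The coefficient values are hypothetical and chosen only to have the signs discussed in the text; they are not the estimates in table 1.

```python
def cohort_effect(exper, b4, b5, exper_prime):
    """Derivative of log earnings with respect to COHORT implied by equation (1)."""
    spline = 1.0 - exper / exper_prime if exper <= exper_prime else 0.0
    return b4 + b5 * spline

b4, b5 = -0.2, -0.4        # hypothetical coefficients, for illustration only
exper_prime = 9.0          # value the text reports for college graduates

entry_effect = cohort_effect(0.0, b4, b5, exper_prime)              # b4 + b5
permanent_effect = cohort_effect(exper_prime, b4, b5, exper_prime)  # b4
profile = [cohort_effect(float(t), b4, b5, exper_prime) for t in range(13)]

for t, effect in enumerate(profile):
    print(t, round(effect, 4))
```

With a negative $b_5$, the effect becomes less negative each year until EXPER', which is why a negative COHORT*SPLINE coefficient implies faster early earnings growth in larger cohorts under the restricted model.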

While only the results for college graduates are shown in table 1, 
cohort size effects that are generally consistent with those of Welch 
are obtained for two of the other three schooling groups. In the case 
of high school graduates, positive COHORT*SPLINE coefficients are
obtained, opposite Welch’s findings. For this group, apparently the 
addition of the thirty-fifth through forty-fourth experience year cells 


5 The data used here have been kindly provided by Finis Welch. The original data 
used in Welch (1979) are no longer available. 

6 The only qualitative differences with Welch's original results found here are that
negative time trend coefficients ($b_7$ in eq. [1]) are estimated in a few cases when the
regressions are updated to 1979.

is crucial for obtaining Welch's results. 7 Overall, however, it appears
that the data set used here is capable of fairly closely replicating 
Welch’s original results. 

Next, the earnings equations given by (5) are estimated and F- 
statistics are calculated under the maintained hypothesis that all of the 
restrictions implied by the Welch specification hold. Regardless of 
which schooling group, dependent variable, or time period is consid¬ 
ered, the restrictions of the Welch model are rejected at the .99 level. 
Even when the weak mean squared error test developed by Wallace 
(1972) is used, the restrictions are rejected at the .95 level in 23 out of 
24 cases. 8 Thus, the data appear to reject the Welch specification in 
favor of the less restrictive one given by equation (5). 

The F-tests indicate that separate earnings equations for workers 
with experience less than and greater than EXPER' are warranted. 
The estimates obtained for the subsample with EXPER < EXPER' are 
emphasized here since they permit a direct focus on the early career 
experiences of the baby-boom birth cohorts. In general, estimates for 
older workers are less interesting since these individuals have already 
been fully absorbed into the labor force and have made any necessary 
adjustments in their human capital investments in response to cohort 
size. Also, the likely presence of uncontrolled vintage effects (other 
than cohort size effects) makes it inadvisable to project the future 
experiences of the baby-boom cohorts based on the current experi¬ 
ences of older cohorts. This is less likely to be a problem within the 
relatively narrow band of birth cohorts covered in the younger 
worker subsamples. 9 

7 Alternatively, it is possible that the difference is caused by the absence of the
NOWORK variable. However, this is not likely since NOWORK is either insignificant 
or only barely significant in Welch's schooling 12 earnings regressions. 

8 The only exception is found in the schooling 8-11 group when the weekly earnings
model controlling for part-time status is estimated from 1967 to 1979. In this case the
calculated F at 2.95 is below the weak MSE .95 critical value of 3.54 given in Goodnight
and Wallace (1972). However, all of the other F's are strongly significant using either
test. Their calculated values range from 6.67 to 36.04.

9 Recall that the key identifying assumption is that the only cohort effects on earnings
are cohort size effects. While this may be plausible across narrow bands of entry year
cohorts, it may not be across wide bands. Any number of factors such as school quality
not otherwise controlled for in the model may vary across wide bands of entry year
cohorts. The “long” and “intermediate” swings of Easterlin (1968) and Wachter (1977)
are also consistent with cohorts defined over a range of years. In fact, Easterlin (1980,
p. 7) argues that a cohort or generation can be defined by an era, such as the 1950s,
rather than by a single year. Heckman and Robb (1983) also suggest that it may be
plausible to define a cohort as a sequence of adjacent years. If so, one way to control for
cohort effects other than cohort size is to estimate earnings equations for relatively
narrow bands of entry cohorts, in effect impounding these other cohort effects into the
intercept. Of course, restricting the estimation to a narrow band of entry cohorts could
present a problem for the estimation of cohort size effects since cohort size variation is
reduced. This does not appear to be a serious problem in the sample used.

Estimates of (5) for college graduates with EXPER ≤ EXPER' are
reported in table 2. While not readily apparent, concave earnings
profiles are found in each case, even when the EXPER coefficient is
negative. By definition, COHORT is always negative since it is the log
of a number between zero and one. Thus, COHORT*EXPER combined
with the negative estimate of $a_{41}$ makes a positive contribution to
the experience effect on earnings. 10 In fact, their joint contribution
swamps that of the EXPER coefficient ($a_{11}$) over the range of
COHORT values in the sample, assuring the usual concave shaped
earnings profiles.
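
The argument in this paragraph can be made concrete with the experience derivative of equation (5) for young workers, $a_{11} + a_{41}\,\text{COHORT} + 2a_{21}\,\text{EXPER}$. The sketch below evaluates it for two cohort sizes. All numbers are assumptions chosen for illustration (negative $a_{11}$, $a_{21}$, and $a_{41}$, and cohort shares of 3 and 5 percent); none are taken from table 2.

```python
import math

def experience_slope(exper, cohort, a11, a21, a41):
    """d ln(WKEARN) / d EXPER implied by equation (5) for young workers."""
    return a11 + a41 * cohort + 2.0 * a21 * exper

# Hypothetical coefficients, all negative as in the case discussed in the text.
a11, a21, a41 = -0.05, -0.002, -0.06

# COHORT is the log of a share between zero and one, so it is negative.
small_cohort = math.log(0.03)   # hypothetical relative cohort size
large_cohort = math.log(0.05)   # hypothetical, larger relative cohort size

slope_small = experience_slope(1.0, small_cohort, a11, a21, a41)
slope_large = experience_slope(1.0, large_cohort, a11, a21, a41)
slope_small_later = experience_slope(5.0, small_cohort, a11, a21, a41)

print(slope_small, slope_large, slope_small_later)
```

Under these assumed values the slope is positive despite the negative EXPER coefficient, it declines with experience (concavity), and it is lower for the larger cohort, which is the pattern the text attributes to a negative COHORT*EXPER coefficient.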

Cohort size effects on earnings are negative upon entry ($a_{31} < 0$),
consistent with past research. However, the negative effects on earnings
increase with experience, as is indicated by the negative coefficient
on COHORT*EXPER in all six regressions reported. Similar
cohort size effects are estimated for the other three schooling groups.
Entry effects are negative or near zero in almost every case. 11 Over
the original 1967-75 sample period, the estimated COHORT*EXPER
coefficient is negative in every instance. When the estimates
are updated through 1979, COHORT*EXPER has a positive (but
insignificant) coefficient in just two out of nine regressions. Taking
all four schooling groups together, the estimated COHORT*EXPER
coefficient is negative in 22 out of 24 regressions and
significantly different from zero at the .95 level in 14 out of 22. Thus,
overall, the evidence suggests that larger cohorts experience slower
early career earnings growth and flatter earnings profiles than do
smaller cohorts. This is directly opposite the conclusion reached by
Welch. It appears that the imposition of the restrictions in equation
(1) leads to implications about cohort size effects on earnings growth
that are opposite those from the less restrictive model given by equation
(5).

One way to check the robustness of the findings presented here is to
alter the value of EXPER'. In terms of Welch's theoretical model this
corresponds to the level of experience at which the individual becomes
a fully vested worker. Although Welch chooses values fairly


10 The partial derivative of ln(WKEARN) with respect to experience in eq. (5) for
young workers is

\[
\frac{\partial \ln(\text{WKEARN})}{\partial\,\text{EXPER}} = a_{11} + a_{41}\,\text{COHORT} + 2a_{21}\,\text{EXPER}.
\]

Since the estimated $a_{41}$'s and COHORT are both negative, they combine to make a
positive contribution to earnings.

11 In only three out of the 18 regressions for the other three schooling groups is a
positive and significant (at the .95 level in a two-tailed test) entry cohort size effect
found. But even in these cases, the cohort size effect turns negative very early in the
career, usually 1-2 years after entry.



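TABLE 2

[Table body not recoverable from the source. Per the text, the table reports estimates of equation (5) for college graduates with EXPER ≤ EXPER'. Only the end of the table note survives:]

... observations in col. B (13 years × [illegible] experience groups) for each dependent variable. Absolute values of t-statistics are shown in parentheses.

early in the career, one of the effects of cohort size may be to slow the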
transition between learner and worker. Therefore, the whole proce¬ 
dure described above has been repeated with values of EXPER' equal 
to 10, 12, and 15 years in each schooling group. In each instance, 
qualitatively similar results to those already reported in table 2 are 
obtained. The restrictions implied by the Welch model are always 
rejected at the .99 level. And as before, the less restrictive models lead 
to the conclusion that increases in cohort size lead to slower rates of 
earnings growth within the younger worker subsamples. 12 

V. Summary and Conclusions 

This study reexamines the theory and evidence surrounding the ef¬ 
fect of cohort size on earnings growth. The initial treatment of this 
topic by Welch (1979) finds that while cohort size depresses earnings 
at entry, these negative effects rapidly diminish and reach a smaller 
permanent level at a relatively young age. Thus, cohort size has a 
positive effect on early career earnings growth. 

Welch’s theoretical framework is found to be fairly restrictive in 
that it assumes an exogenous speed of transition from the learner to 
the worker phase. It is also shown that Welch’s empirical model im¬ 
poses several restrictions. Using data almost identical to those em¬ 
ployed by Welch, the restrictions inherent in his empirical model are 
rejected in favor of a more general model, which involves separate 
earnings equations for older and younger worker subsamples. In or¬ 
der to obtain evidence on the early career experiences of the baby- 
boom cohorts, the analysis here focuses on the results obtained for the 
younger worker subsample. Within this subsample, cohort size gener¬ 
ally has a negative effect on earnings levels, but also appears to slow 
down earnings growth. This latter finding is directly opposite the 
conclusion reached by Welch. 

What implications can be drawn from these results? Cohort size 
effects on earnings levels appear to widen with experience, suggesting 
a continually increasing cohort size earnings “penalty” as workers in 
large cohorts move through their careers. This suggests that there will 
be no quick recovery of the earnings levels of workers in large entry 
cohorts as is implied by Welch’s study. At the very least, the lower 
observed rates of earnings growth in large cohorts are consistent with 


12 With EXPER' equal to 10 and 12 years, negative COHORT*EXPER coefficients
are estimated in 21 out of 24 regressions with 16 being significantly different from zero
at the .95 level. When EXPER' is set at 15, the COHORT*EXPER parameter estimate is
negative 19 out of 24 times, with 12 being significantly different from zero at the .95
level.

slower speeds of transition between the learning and fully trained
stages of the career. 

It should be stressed, however, that the results obtained here come 
from models that relax only Welch’s specification restrictions, while 
maintaining his assumptions about the identification of experience, 
period, and cohort effects. It is possible that other studies with differ¬ 
ent identification assumptions, or which incorporate cohort effects on 
earnings other than cohort size effects, might come to different con¬ 
clusions. But available evidence through the 1970s appears to suggest 
slower earnings growth in larger cohorts.


The Role of Taxes and Social Security in
Determining the Structure of 
Wages and Pensions 


Gene E. Mumy 

Ohio State University


The model in this paper attempts to resurrect the tax structure
explanation of the use of pensions. In recent years this explanation
has been questioned on the grounds that IRAs provide an equivalent
tax shelter for retirement income and that the tax explanation is
inconsistent with observed characteristics of pension plans. It is
shown here that when payroll taxes and social security benefits, as
well as income taxes, are taken into account, pensions and IRAs are
not substitutes but, rather, play complementary roles. On this basis,
it is also shown that the tax explanation of pensions is consistent with
observed pension plan characteristics.


I. Introduction 

The position that pension tax advantages for retirement saving provide
a satisfactory explanation of private pension provision by business
firms has come under heavy attack in recent years. On one front
it is maintained that pensions simply do not have a unique tax advan¬ 
tage because the Individual Retirement Account (IRA) provisions of 
the Employee Retirement Income Security Act of 1974 (ERISA) pro¬ 
vide an equally good alternative to pensions for sheltering retirement 
savings from income taxes (Logue [1979] and Tepper [1981] are examples).


I would like to thank the U.S. Department of Labor for supporting some of the initial
work on this topic under contract J-9-P-0-0158. Valuable comments and suggestions
were made by Richard Cantor, Richard Jensen, Donald Parsons, and other members of
the Applied Price Theory Workshop at Ohio State University. William Manson has
been particularly helpful during a lengthy period of collaboration on pension topics.

On a second front it is maintained that the tax advantage
explanation of pensions is inconsistent with an increasingly evident 
empirical characteristic of many pension plans. 

Several studies recently have concluded that the expected present 
value of an employee’s pension stream tends to decrease as retirement 
age increases (see, e.g., Urban Institute 1982; Lazear 1983; Mitchell 
and Fields 1983). Lazear, in particular, has argued that this empirical 
evidence shatters the tax advantage explanation. After characterizing 
pension tax advantages as simply “a way to save at before-tax rather 
than after-tax rates of interest," Lazear continues, “Although there 
must be some truth to the notion that pensions function as a tax-free 
savings account, this view alone is inconsistent with the finding . . . 
that the expected value of the pension stream declines with increased 
age of retirement. Since nothing is withdrawn explicitly from the 
account until retirement, the value of pension benefits should be 
strictly increasing with age of retirement under the savings account 
interpretation of pensions" (1983, pp. 57—58). 

In light of the apparent success of these attacks on the tax advan¬ 
tage explanation, several alternatives have been proposed, the most 
notable of them falling under the rubric of agency theory (see Lazear 
1979, 1981, 1983; Logue 1979). This paper, however, is mainly con¬ 
cerned with the tax advantage explanation of pensions rather than 
with the agency theory alternatives. The goal is to show that with a 
rich enough conceptualization, tax advantages alone constitute a 
sufficient reason for the existence of pensions and that the evidence 
so far produced to oppose this view falls short of its mark. In Section 
II, a life-cycle model of employee wages and pensions that takes ac¬ 
count of payroll taxes and social security benefits, as well as income 
taxes, is constructed and analyzed with respect to its implications for 
employee wage profiles and the use of pensions versus IRA savings. 
In Section III, further implications of this model are compared with 
current empirical evidence on some important characteristics of em¬ 
ployee pension plans. Section IV contains a discussion of other possi¬ 
ble implications of the model and a presentation of conclusions. 

II. The Tax Advantage Model of 
Wages and Pensions 

Before dealing with the tax advantage model, I must say a few words 
about how it differs from agency theory models. At the heart of 
agency theory models is the notion that the interests of employees 
diverge from those of employers. In particular, after receiving costly
training an employee may want to quit before the firm has recouped 
the cost of training, or, depending on the extent of costly monitoring 



and assignment of rewards, an employee may engage in a suboptimal
level of shirking. A partial solution is to defer compensation and
thereby increase the employee's opportunity cost of quitting or shirking,
with its attendant prospects [illegible]. As a result, the employee's
net productivity rises and the increased value can be advantageously
shared by the employee [and the employer]. The point, then, is
that deferred compensation causes productivity to increase.

[Illegible] the possibility of such agency incentives. My point is that,
[even in the] absence of such considerations, taxes and social security alone
constitute a sufficient explanation for the existence of pensions and
pension plan characteristics. In the model to which I now turn, this is
analytically reflected by the assumption that the employee's productivity
profile does not depend on the time profile of compensation.

In order to construct a benchmark model of tax advantages, it is
initially assumed that an employee's entire working career is spent
with a single employer over the time interval (0, R). After retiring at
t = R, the employee lives for an additional L - R years. It is also
assumed that competitive pressure causes the firm to choose the compensation
profile that maximizes the present value of the employee's
disposable (i.e., net of taxes and IRA contributions) life-cycle income,
such that the present value of payments made on behalf of the employee
is equal to the present value of the employee's marginal productivity
profile. To achieve this, the firm must take account of income
taxes, payroll taxes, social security benefits, and the optimal
profile of IRA contributions, all of which depend on the amount and
time profile of compensation.

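Formally, the problem is

\[
\begin{aligned}
\max\; C = {} & \int_0^R \bigl[(1 - r_1)W(t) - I(t) - T(W - I)\bigr] e^{-it}\,dt \\
& + \bigl[P + S - T(P + S) + B\bigr] \int_R^L e^{-it}\,dt,
\end{aligned}
\tag{1}
\]

subject to the competitive constraint

\[
\int_0^R V(t)\,e^{-it}\,dt = \int_0^R (1 + r_2)W(t)\,e^{-it}\,dt + P \int_R^L e^{-it}\,dt,
\]

which can be written in a more revealing form as

\[
P = A_{RL}^{-1} \int_0^R \bigl[V(t) - (1 + r_2)W(t)\bigr] e^{-it}\,dt \ge 0, \tag{2}
\]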

1 Parsons (1984), on the other hand, argues that this type of model has significant
flaws.

\[
B = B(E, R), \tag{4}
\]

\[
W \ge 0; \qquad 0 \le I \le \bar{I},
\]


where $V(t)$ = the time profile of the value of the employee's marginal
product; $W(t)$ = the wage profile; $I(t)$ = the profile of IRA contributions;
$\bar{I}$ = maximum allowable IRA contribution; $T(\cdot)$ = income
taxes; $r_1$, $r_2$ = employee and employer payroll tax rates, respectively;
$P$ = the level stream pension benefit; $S$ = the level stream retirement
benefit obtained from IRA savings; $B$ = the level stream social security
benefit; $E = [\int_0^R W(t)\,dt]/R$ (average working-life earnings); $i$ = a
positive interest rate; and $A_{RL} = \int_R^L e^{-it}\,dt$.

This formulation of the problem clearly incorporates the important 
tax and social security considerations. 2 Pension accruals and ac¬ 
cumulated IRA contributions are both sheltered from income taxes 
until they are actually realized as retirement income. Pension accruals 
also shelter employee and employer from payroll taxes, while payroll 
income destined for an IRA exposes both to payroll taxes. The other 
side of this coin, however, is that pension accruals are not counted in 
the average earnings base for social security benefits, while IRA con¬ 
tributions from payroll income are counted. Finally, social security 
benefits are represented as being completely tax free, which is a 
minor falsification of current reality. In this setting, the tax and social 
security benefit structures are of the utmost importance and need to 
be briefly considered. 

The U.S. income tax structure is characterized by positive and in¬ 
creasing marginal tax rates as taxable income increases. Therefore, 
we have T' > 0 and T" > 0. Payroll taxes, on the other hand, are 
incurred by both employee and employer at a constant marginal rate 
up to some maximum amount of annual wages, currently (1984) over 
$35,000 a year. Since about 90 percent of payroll income is subject to 
the payroll tax (see Chen 1981), the maximum limit is ignored as it 
will also be in terms of computing the base for social security benefits. 
As for social security benefits, they depend, most importantly, on 
average earnings over the employee’s working career and age of re¬ 
tirement. For the time being we will largely ignore retirement age and 
concentrate on the benefit structure with respect to average earnings. 
In this regard we have $B_E > 0$, $B_{EE} < 0$, which represents diminishing
marginal benefits as average earnings increase. 


2 Mumy and Manson (1983) discuss these considerations at greater length.

We can now proceed to an analysis of our maximization problem,
which is cast in an optimal control framework. The problem is to find
the trajectories of the control variables, $W(t)$ and $I(t)$, that maximize C,
in equation (1), while satisfying the constraints. An appropriate 
choice of the state variables allows the second term in equation (1) to 
be treated as a salvage term, the value of which depends on the 
terminal values of the state variables. Suitable state variables are 


\[
\begin{aligned}
X_1 &= A_{RL}^{-1} \int_0^t \bigl(V(s) - (1 + r_2)W(s)\bigr) e^{-is}\,ds, \\
X_2 &= A_{RL}^{-1} \int_0^t I(s)\,e^{-is}\,ds, \\
X_3 &= R^{-1} \int_0^t W(s)\,ds,
\end{aligned}
\]

with the result that $P = X_1(R)$, $S = X_2(R)$, $E = X_3(R)$, and, hence, $B =
B[X_3(R), R]$. After some substitutions and simplifications, the usual
operations on the Hamiltonian and salvage term yield the following
conditions for the extremals, $W^*(t)$ and $I^*(t)$:


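$$\frac{\partial H}{\partial W} = e^{-rt}\left\{T'(P + S) - T'(W - I) - \tau_1 - \tau_2\left[1 - T'(P + S)\right]\right\} + A B_E R^{-1} = 0; \qquad W > 0; \tag{5}$$

$$\frac{\partial H}{\partial I} = e^{-rt}\left[T'(W - I) - T'(P + S)\right] \gtreqqless 0; \qquad I = \begin{cases} \bar{I} \\ I \in [0, \bar{I}] \\ 0 \end{cases}. \tag{6}$$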

The intuitive content of conditions (5) and (6) is reasonably trans¬ 
parent and leads almost immediately to some important implications. 
We begin by considering the issue of using pensions and/or IRAs.


A. Pensions versus IRAs 3

The first thing to notice is that pensions and IRAs would be equiva¬ 
lent tax shelters, with limits on IRA contributions ignored, if there 
were no payroll taxes or social security benefits. In this case, equation 
(5) reduces to 

T'(P + S) - T'(W - I) = 0,


3 In this section it is assumed that both pensions and IRAs can be used. This is
consistent with the IRA provisions of the Economic Recovery Tax Act of 1981, which
made IRAs available to everyone, regardless of pension plan participation. In the initial
ERISA provisions, IRAs were available only to individuals not participating in qualified
pension plans, meaning that an all-or-nothing choice had to be made between pensions
and IRAs. It is my presumption that only fairly low-income individuals would
have preferred IRAs under these circumstances, for reasons that should be clear from
the argument in the text.

which is equivalent to equation (6), meaning that there are not unique
solutions for W and I. The reason is that identical changes in W and I,
at any point in time, leave taxable income at that point unchanged 
while generating equal but opposite changes in pensions and IRA 
retirement benefits, thus leaving taxable retirement income un¬ 
changed. But this equivalence vanishes in a real-world setting because 
pensions and IRA contributions are affected differently by the pres¬ 
ence of payroll taxes and social security benefits. 

Income-tax-sheltered pension accruals avoid payroll taxes, but at 
the cost of not being counted in the average earnings base for social 
security benefits. Just the reverse is true for income-tax-sheltered 
IRA contributions. While it might be the case that there are some low- 
income individuals, with low average earnings and high marginal 
social security benefits, who want all of their compensation to enter 
the wage base rather than receiving pensions, this is not likely to be 
the case generally. Because of the declining marginal benefit struc¬ 
ture of social security, there is some margin where most employees 
and their employers prefer to forgo additional benefits in order to 
avoid additional payroll taxes, and thus take some compensation in 
the form of pensions. Since for most people marginal social security 
benefits are less than the average benefit per dollar of average earn¬ 
ings, this conclusion follows a fortiori from the empirical evidence, 
which suggests that currently most people can expect, at best, to break 
even in terms of the present value of social security benefits relative to 
the present value of employee and employer payroll taxes (see 
Thompson 1985, pp. 1454-59). 4 The use of pensions, however, does
not mean that IRAs should not be used. 

The exact nature of IRA usage will become clearer when we con¬ 
sider the nature of the wage profiles that emerge from this structure 
of taxes and social security benefits. For now, however, it is useful to 
examine the pattern of IRA usage implied by conditions (5) and (6).
If equation (5) is to be satisfied at every point in the (0, R) time
interval, the effect of discounting requires that the algebraic value of
[T'(P + S) - T'(W - I)] be decreasing over time. Of course, [T'(W - I)
- T'(P + S)] in condition (6) must be increasing over time, meaning
that ∂H/∂I ≷ 0 when t ≷ t̂. I, then, is a bang-bang control (see
Kamien and Schwartz 1981, pp. 186-92) and takes its lower-bound
value, zero, when t < t̂ and its upper-bound value, Ī, when t > t̂. The


4 Of course, this does not mean that the use of pensions depends on social security 
benefits yielding a low average return on payroll taxes. Indeed, the average return can 
be quite substantial, as is the case for the current generation of retirees, and pensions 
will still be used as long as the structure of social security benefits, with respect
to average earnings, generates a margin where additional benefits are less than
the amount of additional taxes.

implication is that IRAs should be used later rather than earlier and,
except at t̂, they should be used to the maximum allowable extent or
not at all. 5 The intuitive sense behind the bang-bang nature of IRA
contributions exposes some important forces at work in the model, so 
it is worth exploring a bit further. 

The results so far indicate that there should not be more than a
single point in time, t̂, for which 0 < I(t) < Ī. Let us assume the
contrary and suppose that I_1 and I_2 are the IRA contributions at t_1 and
t_2, respectively, with t_1 < t_2. Also suppose that W_1 and W_2 are the
respective wage payments at these two points. As long as I_1 > 0 and I_2
< Ī, both employee and employer can be made better off by decreasing
I_1 and W_1 by $1.00 each and increasing I_2 and W_2 by $1.00 each.
The reason is that taxable income, W - I, remains unchanged at each
point in time; average earnings and social security benefits remain
unchanged because the increase in W_2 compensates for the decrease
in W_1; in each period the change in IRA contributions is matched by
an equal change in wages, generating an equal but opposite change in
pension accruals, thus leaving (P + S) unaltered; and, finally, employee
and employer save τ_1 and τ_2, respectively, in payroll taxes at t_1
but incur an additional τ_1 and τ_2 in payroll taxes at t_2. But the present
value of the additional payroll taxes at t_2 is less than the present value
of payroll tax savings at t_1, so employee and employer both benefit
from the deferral of payroll taxes.
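The deferral argument can be checked with a small numerical sketch. Everything below is illustrative rather than taken from the paper: the payroll tax rates, the discount rate, the two dates, and the function names are assumptions chosen only to show the mechanism, and continuous discounting is assumed.

```python
import math


def pv_payroll_tax_change(shift, tau_employee, tau_employer, r, t1, t2):
    """Change in the present value of payroll taxes when `shift` dollars of
    wages are moved from date t1 to the later date t2 (continuous discounting)."""
    combined_rate = tau_employee + tau_employer
    saved_at_t1 = combined_rate * shift * math.exp(-r * t1)
    added_at_t2 = combined_rate * shift * math.exp(-r * t2)
    return added_at_t2 - saved_at_t1


def average_earnings(wages):
    """Undiscounted average of wages, as used for the benefit base."""
    return sum(wages) / len(wages)


# Hypothetical numbers: two dates, a $1.00 shift, 7 percent payroll tax on each side.
wages_before = [30000.0, 30000.0]
wages_after = [wages_before[0] - 1.0, wages_before[1] + 1.0]

delta_pv = pv_payroll_tax_change(
    shift=1.0, tau_employee=0.07, tau_employer=0.07, r=0.05, t1=10.0, t2=30.0
)

print(delta_pv)  # negative: the present value of payroll taxes falls
print(average_earnings(wages_before) == average_earnings(wages_after))
```

Because the earnings average is undiscounted while the tax payments are discounted, the shift lowers the present value of payroll taxes and leaves the benefit base where it was.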

The key to the intuitive sense of why IRA contributions are bang- 
bang is that by deferring wages the present value of payroll taxes can 
be decreased while social security benefits are unaltered. The reason 
for the latter result is that average earnings are computed on the basis 
of undiscounted wages. From the standpoint of payroll taxes, the time 
profile of wages matters, while from the standpoint of average earnings
it does not matter. This is also an important force in determining
the shape of wage profiles, to which we now turn. 


B. Wage Profiles

It was pointed out above that equation (5) requires [T'(P + S) - T'(W
- I)] to be algebraically decreasing over time. But T'(P + S) is a
constant with respect to t, so this actually requires that T'(W - I) be
increasing over time. Furthermore, since I = 0 over the (0, t̂) time
interval and then jumps up discontinuously to continue at I = Ī over


5 This is an important empirical implication of the model, but it is contingent on the
employee's obtaining optimal pension provision. Pension plans, however, are for
groups of employees and may not be tailored perfectly for each person, so this implication
must be qualified. Nevertheless, this implication is probably worth some empirical
exploration.

the (t̂, R) interval, it follows that W must increase continuously over (0,
t̂), then jump up discontinuously at t̂, and then continuously increase
over (t̂, R). What we have here is a tax and social security explanation
for wage profiles that are increasing with respect to time, 6 regardless
of the employee's productivity time profile.

This outcome represents a balance between two counteracting ten¬ 
dencies. On one hand, the progressive marginal rate structure of the 
income tax generates pressure for a level compensation profile. On 
the other hand, payroll taxes and the structure of social security 
benefits generate an incentive to defer wage compensation toward the 
end of working life. The reason for this latter incentive is that as a 
dollar of wage income is deferred, the present value of the payroll 
taxes paid on that dollar declines, but an earlier or later dollar of wage 
compensation enters the calculation of average earnings for social 
security in an undiscounted manner. 

We are now in a position to appreciate fully why pensions and IRAs
are not substitutes but, rather, play complementary roles. The social
security and payroll tax incentive is to defer compensation early in
working life and thus avoid payroll taxes and allow pension benefits to
accrue, while the concentration of compensation in the latter part of
working life builds up the average earnings base for social security
benefits when the present value of payroll taxes is lower. This incentive
is mitigated, however, because the progressive income tax rate
structure causes concentrated earnings to be taxed at higher marginal
rates, thus increasing the average income tax rate paid over the individual's
working life. This effect is lessened by the use of IRA contributions
in the latter part of working life to provide an income tax
shelter for the concentrated earnings that are exposed to the payroll
tax and enter the social security benefit base.

Having shown that pensions and IRAs have different but com¬ 
plementary advantages, and that together they are consistent with an 
upward-sloping wage profile, I turn to pension plan characteristics. 

III. Pension Plan Characteristics 

A. Retirement Age and the Value of the Pension Stream 

Since taxes and social security benefits dictate an upward-sloping 
wage profile, regardless of the productivity profile, it is entirely possi¬ 
ble, even likely, that the employee's marginal product is less than 
employer payments (wage and payroll tax payments), as retirement 

6 Empirical evidence suggests that wage profiles are basically increasing (see, e.g.,
Heckman 1976), this also being a crucial factor in Lazear's agency explanations of
mandatory retirement and pensions.

age is approached. This is the situation illustrated in figure 1, which
has a flat productivity profile for convenience.

If the employee retires at R, the present value of the level pension
stream, P, over the interval (R, L) is equal to the present value of the
area between V and employer payments, curve EE, over (0, t_1) minus
the present value of the area between EE and V over (t_1, R). 7 If the
employee retires at a date earlier than R, then the interval over which
employer payments exceed productivity is shortened and the present
value of the pension stream should be higher. For entirely different
reasons than those usually given, we have arrived at the result that the
present value of the pension stream decreases as retirement age increases.
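A rough numerical sketch of this result follows. The flat productivity level, the linear employer payment profile, the discount rate, and the retirement dates are all hypothetical stand-ins for the curves V and EE in figure 1, not values from the paper; the present value is approximated with a simple midpoint rule.

```python
import math


def pension_present_value(retire_at, productivity, payment, r, steps=10000):
    """Present value at t = 0 of productivity minus employer payments over
    (0, retire_at), approximated with the midpoint rule."""
    dt = retire_at / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += (productivity(t) - payment(t)) * math.exp(-r * t) * dt
    return total


def flat_productivity(t):
    """Hypothetical flat productivity profile (curve V)."""
    return 100.0


def rising_payments(t):
    """Hypothetical employer payment profile (curve EE); crosses V at t = 20."""
    return 60.0 + 2.0 * t


pv_normal = pension_present_value(40.0, flat_productivity, rising_payments, r=0.03)
pv_early = pension_present_value(35.0, flat_productivity, rising_payments, r=0.03)

print(pv_early > pv_normal)  # retiring earlier leaves a larger pension value
```

Any payment profile that lies above productivity near retirement would give the same ordering, since each additional year worked then subtracts from the accumulated pension value.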

B. Employee Tenure, Retirement Age,
and Pension Value

Lazear (1983) has presented evidence that the decrease in pension 
value as retirement age increases is smaller for employees who have 

7 Since it adds nothing to the graphical analysis, the possible discontinuity in EE that
might arise because of IRA contributions has been omitted.




less tenure with the firm. Again, this is perfectly consistent with the 
tax advantage model of pensions. 

To see this, suppose an employee joins the firm at some date after t 
= 0 but has the same productivity profile and expected retirement 
age as an employee who joined the firm at t = 0. In terms of figure 1, 
suppose the employee joins the firm at t_1, to take the most dramatic
case. Employer payments for an employee who has been with the firm
since t = 0 will be on curve EE. It is clear, however, that the firm
cannot make payments along EE over the interval (t_1, R) for an employee
who joined the firm at t_1, because working-life payments would
be greater than productivity, necessitating a negative pension. As a 
result, the firm will only be willing to make payments along some 
lower curve like E'E'. Since in this case employer payments exceed 
productivity by a smaller amount as normal retirement is approached, 
the value of the pension stream decreases by a smaller amount than 
for an employee with longer tenure and the firm making payments 
along EE. 

Interestingly, the length-of-tenure case illustrates the superiority of 
long-term contracts over a continuous auction market for labor. Pay¬ 
ment profiles like EE, in figure 1, cannot be sustained in the continu¬ 
ous auction market setting because firms would not be willing to make 
payments greater than the employee’s productivity. This also means 
that the tax and social security incentives for pensions and an upward- 
sloping wage profile limit employee mobility even in the absence of 
such devices as delayed vesting of pension benefits. Even if accrued 
pension benefits are fully vested, a job change prevents an individual 
from following the compensation profile that maximizes after-tax 
wealth. 

C. Income, Retirement Age, and the Value of Pensions 

Lazear has also presented evidence that the decrease in pension value 
as retirement age increases is greater for higher-income employees 
than for lower-income employees. This evidence is consistent with the 
tax advantage model of pensions if one compares employees that 
have productivity profiles at different levels but with the same curva¬ 
ture and, of course, assumes the same tenure and normal retirement. 

Basically, higher-income employees should have a more steeply 
sloped wage profile than lower-income employees. The main reason 
for this is that higher-level wage profiles expose both employee and 
employer to higher payroll taxes, while the marginal social security 
benefits from higher average earnings diminish. There is a greater 
incentive, therefore, to defer compensation for higher-income employees
to reduce even further the present value of payroll taxes in
light of the reduced marginal benefits to additional earnings. This
incentive, of course, is moderated somewhat by the progressive income
tax rate structure. Under these circumstances the associated
employer payments profile will exceed productivity by a greater 
amount, at normal retirement, for higher-income employees. Early 
retirement should generate a larger increase in pension value than 
would be the case for a lower-income employee. 


IV. Other Implications and Conclusions 

The model presented in this paper clearly has implications beyond
those having to do with pension characteristics. Perhaps the most
[...]
market setting without private pensions (e.g., Brittain 1971; Hamermesh
1979). The impact of the employer's share of the payroll tax on
equilibrium wages is then addressed solely in terms of labor supply
and demand elasticities. A cursory glance at the model in this paper
indicates that both employer and employee payroll tax rates influence 
the level and profile of wages by inducing a shift from wages to 
pensions, in addition to cutting into the employer’s demand price and 
the employee’s supply price. 

Along these same lines, there is some question about the extent to
which sound financing of the social security system can be assured by 
raising payroll taxes and/or reducing benefits. Chen (1981) has shown 
the relative erosion of the payroll tax base as pensions and other 
fringe benefits have grown in importance. In addition to the usual 
erosion implied by shifting of the employer’s share of the payroll tax, 
the model in this paper clearly links pension growth and wage-base 
erosion to the employee’s share of the payroll tax, income taxes, and 
the level of social security benefits. All of these factors deserve more
attention in analyses of social security reform.

As it turns out, the tax advantage explanation of pensions is not 
only logically and empirically sound, it also has implications for other 
important current issues. It is also interesting to note that the com¬ 
pensation profiles generated by tax advantage considerations entail 
an agency incentive, as was seen in Section IIIB. In short, tax advantages
are as serviceable as ever for explaining the provision of pensions.


Testing Hypotheses about Firm
Behavior in the Cigarette Industry

Daniel Sullivan

Princeton University

Data on the effects of excise taxes are used to investigate the level of
competition in the cigarette industry. A lower bound on the numbers
equivalent of firms in the context of the conjectural variations
model is derived and estimated. Results point to at least a moderately
high level of competition and, in particular, allow for the rejection of
the perfect cartel model. This paper relies on comparative statics
results and hence circumvents recent criticisms of a study of monopoly
behavior in the cigarette industry.


I. Introduction 

In an article in this Journal, Sumner (1981) uses price and excise tax
data to estimate the degree of monopoly power in the cigarette industry. 1
His results, which do not make use of any possibly unreliable cost
data, 2 indicate that while the hypothesis of perfect competition is
untenable, so is that of an effective cartel. In a recent note, however,
Bulow and Pfleiderer (1983) call the latter of these results into question
by demonstrating that Sumner's methodology is extremely sensitive


Thanks are owed to Orley Ashenfelter, David Card, Richard Quandt, David Knhin,
George Stigler, Daniel Sumner, an anonymous referee, and especially to Robert Willig,
who supervised my second-year paper at Princeton University, of which this is an
extension. Partial support from the Sloan Foundation is gratefully acknowledged.

1 For some of the history of the cigarette industry, see Nicholls (1951), Telser (1962),
and Schmalensee (1972).

2 Other approaches to estimating the degree of monopoly power without using cost
data include Bresnahan (1981) and Baker and Bresnahan (1984). Iwata (1974), Appelbaum
(1979, 1982), and Gollop and Roberts (1979) are among those who use cost data
for similar purposes.

to assumptions about the functional form of the demand curve.
This paper attempts to establish the absence of a cartel by employing 
a methodology that relies on much weaker assumptions than those 
criticized by Bulow and Pfleiderer. 

The present analysis is inspired by a paper by Rosse and Panzar 
(1977). They show that simple comparative statics results suffice to 
generate testable predictions of the monopoly model. Here, the com¬ 
parative statics of the conjectural variations model are shown to imply 
that a certain quantity, relating to the response of price and output to 
variation in the tax rate, is, under the most minimal assumptions 
(negatively sloped demand curve and positively sloped total cost 
curves), a lower bound on the numbers equivalent of firms. That is, it 
is shown that if this quantity is equal to some number n, then, in a 
well-defined sense, the industry must be at least as competitive as n 
firms playing a quantity Cournot game. Estimation results for the 
cigarette industry point to at least a moderately high level of competi¬ 
tion and, in particular, allow for the rejection of the perfect cartel 
model. Application of the methodology to other industries with varia¬ 
tion in an observable component of marginal cost should be possible. 

Section II of this paper derives the lower bound mentioned above 
from the comparative statics of the conjectural variations model. Sec¬ 
tion III discusses the statistical implementation. Results are in Section 
IV and conclusions in Section V. 

II. Comparative Statics 

Assume that there are n producers of a homogeneous good, that they
have cost functions C_1(Q_1), ..., C_n(Q_n), and that market price is given
by the inverse demand function P(Q), where Q = Σ Q_i is the total
industry output. Then if the excise tax is t, the ith firm's profits are
Q_i[P(Q) - t] - C_i(Q_i). Assume further that each firm maximizes its
profits subject to its perceptions of how other firms' outputs will respond
to its own. These perceptions are summarized by the quantities
a_i = d(Σ_{j≠i} Q_j)/dQ_i, which may depend on the vector of outputs and the
tax rate and which are referred to as the firms' conjectures.

One quantity that arises in a natural way in the analysis of this
model is Σ 1/(1 + a_i), the numbers equivalent of firms. In the Cournot
case where a_i = 0 for all i, it is indeed just the number of firms.
Moreover, in a perfectly competitive market, the a_i → -1 and Σ 1/(1
+ a_i) → ∞, while a perfect cartel would imply Σ 1/(1 + a_i) = 1. Thus
the numbers equivalent is a meaningful scale on which to measure the
level of competition in an industry.
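The three benchmark cases can be illustrated with a short sketch. The function name and the symmetric cartel conjecture a_i = n - 1 (each firm expecting every rival to match its output change one for one) are assumptions made for illustration; the text itself states only that a perfect cartel implies a numbers equivalent of 1.

```python
def numbers_equivalent(conjectures):
    """Sum of 1/(1 + a_i) over firms, given each firm's conjecture a_i."""
    return sum(1.0 / (1.0 + a) for a in conjectures)


n = 4  # hypothetical number of firms

cournot = numbers_equivalent([0.0] * n)             # a_i = 0 for all i
cartel = numbers_equivalent([n - 1.0] * n)          # assumed symmetric cartel
near_competitive = numbers_equivalent([-0.999] * n)  # a_i close to -1

print(cournot)           # equals the number of firms
print(cartel)            # equals 1
print(near_competitive)  # grows without bound as a_i approaches -1
```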

Let q_1(t), ..., q_n(t) denote equilibrium levels of output when the tax
rate is t. Further define q(t) = Σ q_i(t) and p(t) = P[q(t)]. Finally, let
a_i(t) denote the value of the ith firm's conjecture when outputs are
q_1(t), ..., q_n(t) and the tax rate is t. The first-order conditions for
profit maximization imply that

$$P[q(t)] + q_i(t)\,P'[q(t)]\,[1 + a_i(t)] - t = C_i'[q_i(t)] \tag{1}$$

for all i and t. 

If marginal cost is above some level, c, for all firms in the industry, 
then it follows from (1) that 


$$\frac{p(t) - t - c}{1 + a_i(t)} + q_i(t)\,\frac{p'(t)}{q'(t)} \ge 0. \tag{2}$$


Summing this relation over i shows that 

$$\sum_i \frac{1}{1 + a_i(t)} \ge n^*(c, t) \equiv \frac{-p'(t)\,q(t)}{[p(t) - t - c]\,q'(t)}. \tag{3}$$

That is, if all marginal costs exceed c, n*(c, t) is a lower bound on the
numbers equivalent of firms. 3 Put differently, if p(t) and q(t) are such
that n*(c, t) = n, then they could not have been generated by any less
than n firms playing a quantity Cournot game if each had marginal
costs of at least c.
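The steps from the first-order condition to the lower bound can be written out as follows. This is a sketch of the intermediate algebra, assuming 1 + a_i(t) > 0, q'(t) ≠ 0, and p(t) - t - c > 0; these sign conditions are assumptions added here to make the divisions explicit.

```latex
% Marginal cost at least c, combined with the first-order condition (1):
p(t) - t - c \;\ge\; p(t) - t - C_i'[q_i(t)] \;=\; -\,q_i(t)\,P'[q(t)]\,[1 + a_i(t)].

% Since p(t) = P[q(t)], the chain rule gives P'[q(t)] = p'(t)/q'(t).
% Dividing by 1 + a_i(t) > 0 yields (2):
\frac{p(t) - t - c}{1 + a_i(t)} + q_i(t)\,\frac{p'(t)}{q'(t)} \;\ge\; 0.

% Summing over i and using \sum_i q_i(t) = q(t):
[p(t) - t - c]\sum_i \frac{1}{1 + a_i(t)} \;\ge\; -\,q(t)\,\frac{p'(t)}{q'(t)}.

% Dividing by p(t) - t - c > 0 yields (3):
\sum_i \frac{1}{1 + a_i(t)} \;\ge\; \frac{-\,p'(t)\,q(t)}{[p(t) - t - c]\,q'(t)} \;=\; n^*(c, t).
```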

For later reference, it is worth noting that if e(Q) = -P(Q)/[QP'(Q)]
denotes the elasticity of market demand, then (3) can be written as

$$e[q(t)] \ge \frac{p(t)}{[p(t) - t - c]\,\sum_i 1/(1 + a_i)}, \tag{4}$$


which is an extension of the familiar rule that a monopolist will not 
produce on the inelastic portion of his demand curve. 

Estimates of p(t) and q(t) and the derived quantity n*(c, t ) are pre¬ 
sented below for the cigarette industry. It turns out that even the 
harmless assumption that marginal costs are always positive leads to 
nontrivial conclusions about the numbers equivalent and, in particu¬ 
lar, allows for the rejection of the perfect cartel model. 


III. Statistical Implementation 

Data 

The data used in this study cover the years 1955-82 and are taken 
from 45 states. 4 The source for all three variables is the Tobacco Tax


3 It is not hard to show that without further assumptions about functional form this
condition essentially exhausts the information in the first-order conditions: If (3) holds
and in addition p'(t) ≥ 0 and q'(t) ≤ 0, then there exist cost and demand functions with
the usual slopes such that the first-order conditions (1) hold. The inequalities above can
all be reversed to show that if all marginal costs are less than c, then n*(c, t) is an upper
bound on the numbers equivalent.

4 Data from New Hampshire and Hawaii were not used because until recently those

Council's publication, The Tax Burden on Tobacco (1982). The price
variable is their "weighted-average price per package," the output
variable their "per capita sales," 5 and the tax variable a weighted
average of the rates (with weights proportional to the length of time 
the rate was in effect) they show as having prevailed in each state and 
year. The two dollar-denominated variables were converted to real 
terms by dividing by the National Consumer Price Index. Observa¬ 
tions with missing data were dropped, leaving a total of 1,223 state- 
year pairs to use for estimation and testing. 6 
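The construction of the tax variable can be sketched as follows. The function names, the use of months as the unit of time, and all numerical values are illustrative assumptions; the text specifies only that rates are weighted by the length of time in effect and that dollar-denominated variables are deflated by the consumer price index (taken here to equal 100 in the base year).

```python
def time_weighted_rate(spells):
    """Average tax rate for a state-year, weighting each rate by the number of
    months it was in effect. `spells` is a list of (rate, months) pairs."""
    total_months = sum(months for _, months in spells)
    return sum(rate * months for rate, months in spells) / total_months


def to_real(nominal, cpi, base=100.0):
    """Convert a nominal amount to base-year terms using a price index."""
    return nominal * base / cpi


# Hypothetical state-year: 8 cents for six months, then 10 cents for six months.
nominal_rate = time_weighted_rate([(8.0, 6), (10.0, 6)])
real_rate = to_real(nominal_rate, cpi=120.0)

print(nominal_rate)  # 9.0
print(real_rate)     # 7.5
```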

Univariate statistics are shown in table 1.

These data are used, below, to estimate the reduced-form functions 
p(t) and q(t). It is assumed that each state-year pair is an independent 
market and that variation in cost and demand conditions between 
these markets is accounted for by the analysis of covariance tech¬ 
niques described below. 

The independence assumption should be regarded as an approxi¬ 
mation. Sumner notes that legal arbitrage activity by cigarette dis¬ 
tributors is unlikely to be profitable because the institutional setting 
makes this behavior impossible to hide and thus likely to be punished 
by manufacturers. However, illegal bootlegging of cigarettes from 
low- to high-tax states is potentially very profitable, and numerous 
observers have remarked on the inability of law enforcement officials 
to eliminate it completely. 7 Nevertheless, the present study assumes 
that such effects are empirically unimportant and can be safely ig¬ 
nored. 

It should also be mentioned that since retail rather than wholesale 
price data are used here, all results, strictly speaking, apply to the 
combined system of cigarette manufacturers, distributors, and retail¬ 
ers. However, the large number of the latter two types of agents 
makes quite plausible Sumner’s assumption that any deviation from 


states charged an ad valorem sales tax rather than a per unit excise tax. Data from the
major cigarette-producing states, North Carolina, Kentucky, and Virginia, were
dropped because the relationships among the variables for those states seemed
sufficiently different to make their inclusion unwise. Similarly, data from the District of
Columbia were not used because of certain anomalies traceable to the close connection
of its economy with those of Virginia and Maryland.

5 Note that the value of n*(c, t) is unchanged if per capita consumption replaces total
output in (3).

6 Sumner also obtained his data from the Tobacco Tax Council and describes more
fully their collection procedures. Differences between his data set and the one employed
here are that (i) the present study makes use of 4 additional years of data; (ii)
Sumner included data from North Carolina, Virginia, Kentucky, and the District of
Columbia; (iii) he did not use data from Alaska; (iv) he did not make use of the quantity
data and thus did not have to drop those observations for which the quantity variable
was missing; and (v) he used year-end tax rates in years in which more than one rate
prevailed.

7 George Stigler has pointed out that this smuggling may be less than vigorously opposed
by the manufacturers since it helps restrain state taxation.


TABLE 1

Univariate Statistics

                       Real Tax Rate   Real Price   Quantity
Observations               1,223          1,223       1,223
Mean                       13.59          29.12       121.5
Standard deviation          3.14           3.15        22.4
Maximum                    22.56          40.94       192.2
Minimum                     5.22          20.08        61.9
Median                     14 IN          29.24       121.8
Interquartile range         4.05           3.91        26.0

Note: Real price and real tax rate are in 1967 cents. Quantity is in packages per person per year.


the competitive paradigm can be attributed to the actions of the handful
of large manufacturers. In any case, the relevant c in n*(c, t)
includes costs incurred by wholesalers and retailers. 8


Statistical Model


The following multivariate analysis of covariance model was em¬ 
ployed to represent the data: 


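$$\begin{pmatrix} \ln q_{is} \\ p_{is} \end{pmatrix} = \begin{pmatrix} a^1 \\ a^2 \end{pmatrix} + \begin{pmatrix} b_i^1 \\ b_i^2 \end{pmatrix} + \begin{pmatrix} d_i^1 \\ d_i^2 \end{pmatrix}(s - \bar{s}) + \begin{pmatrix} f_s^1 \\ f_s^2 \end{pmatrix} + \begin{pmatrix} g^1 \\ g^2 \end{pmatrix} t_{is} + \begin{pmatrix} h^1 \\ h^2 \end{pmatrix}(t_{is} - \bar{t})^2 + \begin{pmatrix} e_{is}^1 \\ e_{is}^2 \end{pmatrix}, \tag{5}$$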


where q_is, p_is, and t_is are, respectively, per capita consumption, price,
and tax rate in state i and year s; s̄ is the midpoint of the sample period
(1968.5); and t̄ is the mean tax rate (13.59 in 1967 cents).

With the identification restrictions Σ_i b_i^j = 0, Σ_i d_i^j = 0, and Σ_s f_s^j = 0
(j = 1, 2), a^1 and a^2 are overall intercepts; the b_i^1 and b_i^2 are state
effects measured as deviations from overall levels; the f_s^1 and f_s^2 are
year effects, also measured as deviations from the overall levels; and
the d_i^1 and d_i^2 are the coefficients of state-specific time trends measured


8 More formally, let c^1 denote the combined level (assumed constant) of the wholesalers'
and retailers' marginal costs. Then in the context of the conjectural variations
model of Sec. II, the ith firm's profits will be Q_i P(Q) - c^1 Q_i - C_i(Q_i) - tQ_i, where P(Q)
is now retail price. Now if it is assumed that C_i'(Q_i) ≥ c^2 for all i, then the argument
leading to (3) shows that Σ 1/[1 + a_i(t)] ≥ n*(c^1 + c^2, t). If, contrary to the assumption
adopted here, smuggling were empirically significant, the manufacturers' production
decisions would be dependent on cross-state demand elasticities and the simple model
of Sec. II would not be adequate.

as deviations from the overall time pattern given by the f_s^1 and
f_s^2. The linear effects are captured by g^1 and g^2, and h^1 and h^2 capture
the quadratic effects of the tax rate on quantity and price. The e_is^1 and
e_is^2 are an array of bivariate error terms, the statistical properties of
which will be discussed shortly.

Logarithms were used for quantity and levels for price and tax rate 
because this was judged to give the best fit to the data, producing, for 
example, residuals with the most nearly symmetrical distribution and 
most nearly constant variances. This formulation amounts to fitting to 
the data the functions 


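$$q_{is}(t) = \exp\{k_{is}^1 + g^1 t + h^1 (t - \bar{t})^2\}$$

and (6)

$$p_{is}(t) = k_{is}^2 + g^2 t + h^2 (t - \bar{t})^2,$$

where k_is^j = a^j + b_i^j + d_i^j(s - s̄) + f_s^j (j = 1, 2). 9 Thus (3) implies that

$$n^*(c, t) = \frac{-[g^2 + 2h^2(t - \bar{t})]}{[g^1 + 2h^1(t - \bar{t})]\,[k_{is}^2 + g^2 t + h^2 (t - \bar{t})^2 - t - c]}. \tag{7}$$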
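For readers who want to see how the bound is evaluated from fitted coefficients, the sketch below codes the ratio -p'(t) / {[q'(t)/q(t)] [p(t) - t - c]} for a log-quadratic quantity function and a quadratic price function of the kind described above. The function name and every coefficient value are invented for illustration and are not estimates from this study.

```python
def n_star(t, c, k2, g1, g2, h1, h2, t_bar):
    """Lower bound on the numbers equivalent at tax rate t and cost floor c,
    for ln q = k1 + g1*t + h1*(t - t_bar)**2 and p = k2 + g2*t + h2*(t - t_bar)**2.
    The intercept k1 drops out because only q'(t)/q(t) enters the bound."""
    dev = t - t_bar
    price = k2 + g2 * t + h2 * dev ** 2
    price_slope = g2 + 2.0 * h2 * dev   # p'(t)
    log_q_slope = g1 + 2.0 * h1 * dev   # q'(t) / q(t)
    return -price_slope / (log_q_slope * (price - t - c))


# Hypothetical coefficients, evaluated at the mean tax rate with c = 0.
bound = n_star(t=13.59, c=0.0, k2=15.0, g1=-0.02, g2=1.0, h1=0.0, h2=0.0, t_bar=13.59)
print(bound)  # about 3.33 for these made-up numbers
```

A higher assumed cost floor c shrinks the margin p(t) - t - c and therefore raises the bound, which is why even the weak assumption c = 0 can be informative.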


It should be noted that the above specification of a functional form 
for p(t) and q(t) does not make the present study vulnerable to the
Bulow-Pfleiderer criticism. The difficulty with Sumner's analysis is
not in his estimates of p(t) but, rather, in his derivation of the relationship
between p'(t) and the elasticity of demand. That relationship, as
Bulow and Pfleiderer show, depends critically on the functional form 
of the demand curve. In the present instance, the predictions of the 
models being tested are simple comparative statics results and are 
entirely independent of functional form. Specific functional forms 
are employed only in the estimation stage. 

Of course, the exact values of the estimates obtained depend on the 
form chosen. The robustness of the results to this choice was, there¬ 
fore, investigated fairly extensively. It was found that, as long as the 
class of functions chosen was sufficiently flexible (i.e., allowed the 
functions and their first two derivatives at a given point to take on any 
set of values), the dependence was relatively slight. 


Estimation 

The parameters of (5) were estimated initially by ordinary least 
squares. However, examination of the residuals revealed a significant 
degree of serial correlation. An attempt was made to identify the 


9 Adding interactions of tax rate with various fixed effects did not significantly improve
the fit.

pattern of this correlation and to construct an appropriate two-step
generalized least-squares estimator giving efficient parameter esti¬ 
mates and consistent estimates of their standard errors. This task was 
complicated by the fact that, even with a large number of cross-section 
units, in models with fixed effects such as (5), the usual estimators of 
the autocovariance function have nonnegligible biases when the num¬ 
ber of time-series units is small (see, e.g., Nickell 1981; Solon 1984). 
Moreover, the incidental parameter problem is worse here than usual 
because of the addition of state-specific trends. 

In order to reduce this bias (and assess its size), a version of the
half-sample grouped jackknife estimator was employed (see Quenouille
1949, 1956; Efron 1982). Specifically, (5) was estimated by ordinary
least squares on the first and second halves of the data separately, and
the two sets of residuals were each used to estimate the various
autocovariances. Letting r_1 and r_2 denote the values of an estimator
on the first and second halves of the data, respectively, and R its
value on the whole sample, the jackknife estimate of bias is
R - [(r_1 + r_2)/2] and the corrected estimator is 2R - [(r_1 + r_2)/2].
As is well known, if the original estimator has bias of order 1/T,
where T is the number of time periods, the corrected estimator will
have bias of order 1/T^2.
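The half-sample correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the estimator used in the example (the divide-by-T sample variance, which has bias of order 1/T) merely stands in for the autocovariance estimators used in the paper.

```python
def biased_variance(xs):
    """Divide-by-T sample variance; its bias is of order 1/T (illustrative stand-in)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def half_sample_jackknife(estimator, xs):
    """Return (bias estimate, corrected estimate) from the two half samples.

    R is the estimator on the whole sample; r1 and r2 are its values on the
    first and second halves. Bias estimate: R - (r1 + r2)/2.
    Corrected estimator: 2R - (r1 + r2)/2.
    """
    half = len(xs) // 2
    r1 = estimator(xs[:half])
    r2 = estimator(xs[half:])
    R = estimator(xs)
    average_of_halves = (r1 + r2) / 2
    return R - average_of_halves, 2 * R - average_of_halves


# Hypothetical data, chosen only so the arithmetic is easy to follow.
bias, corrected = half_sample_jackknife(biased_variance, [1.0, 2.0, 3.0, 4.0])
```

With the four illustrative observations, R = 1.25 and r_1 = r_2 = 0.25, so the bias estimate is 1.0 and the corrected value is 2.25.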

The corrected estimators were used to investigate the pattern of
correlation in the errors. It was found that they could be modeled as 
independent first-order vector autoregressive processes. (For some 
details of this investigation, see App. A.) That is, defining 



the following error structure was postulated: 

$$e_{js} = A\,e_{j,s-1} + u_{js}, \tag{8}$$

where the $u_{js}$ are an array of independent bivariate errors, each with
mean 0 and covariance matrix $\Sigma$, and where $A$ is a 2 × 2 matrix.

The procedure used here for estimating (5) and (8) is essentially
that of method two of Guilkey and Schmidt (1973).¹⁰ Jackknife-corrected
estimates of A and Σ were computed from least-squares estimates of the
disturbances in (5). These were then used to transform the data so that
the assumptions underlying least-squares estimation techniques were
approximately satisfied. (The appropriate generalized least-squares
transformation matrices are shown in App. B.) The transformed data were
used to reestimate (5), and the improved estimates were then substituted
into (7) to obtain estimates of n*(c, t). Approximate confidence
intervals for n*(c, t) were obtained from the covariance matrix of the
estimators using the delta method.¹¹

10 First-year data were not simply dropped here as they were in the
Monte Carlo work of Guilkey and Schmidt (1973).
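A minimal sketch of the delta-method step may help fix ideas. It assumes that the gradient of the function of interest has already been evaluated at the point estimates; the function names, the example function g(a, b) = a/b, and the covariance matrix are all hypothetical and are not taken from the paper.

```python
import math


def delta_method_variance(gradient, cov):
    """Approximate var[g(b)] by d C d^T, where d is the gradient of g at the
    point estimates and C is the covariance matrix of the estimates."""
    k = len(gradient)
    return sum(gradient[i] * cov[i][j] * gradient[j]
               for i in range(k) for j in range(k))


def confidence_interval(estimate, variance, z=1.96):
    """Symmetric interval under an approximately Gaussian distribution."""
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical example: g(a, b) = a / b evaluated at (a, b) = (2, 4).
a, b = 2.0, 4.0
gradient = [1.0 / b, -a / b ** 2]      # partial derivatives of a / b
cov = [[0.04, 0.0], [0.0, 0.16]]       # assumed covariance of (a, b)
variance = delta_method_variance(gradient, cov)
lower, upper = confidence_interval(a / b, variance)
```

Because the interval is built from a first-order approximation, it is symmetric about the point estimate by construction.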

IV. Results 

Final estimates of the slopes and overall intercepts of the reduced
form (5) are given in table 2 along with the estimates of the
covariance structure (8). (Numbers in parentheses are standard
errors.)

As can be seen from the table, the estimates of the two 
the expected signs. In terms of q(t), the estimates for ln q(t)
correspond to q(t) = 163.3, q'(t) = -2.93, and q''(t) = 0.041. In
particular, the estimate of p'(t) is greater than one by a statistically
significant amount. Thus the data allow for the rejection of perfect
competition with constant industry marginal costs.¹²

Table 3 presents point estimates and approximate 95 percent 
confidence intervals for overall n*(c, t) for values of c ranging from
zero to 12 (in 1967 cents). These were computed by replacing k„ in (7) 
by the overall intercept, a 2 , and substituting the sample mean of the 
tax rates (13.59 in 1967 cents) for t. 

As can be seen, the estimates of n*(c, t) support the view that there
is at least a moderate amount of competition in the cigarette industry.
The 95 percent confidence interval for n*(0, t) lies entirely above 2.5.
Thus even if the marginal cost of producing a package of cigarettes
were zero, the behavior of the industry price and output functions
could not be rationalized by the equivalent of any less than 2½ firms
playing a quantity Cournot game. In particular, the data allow for the
rejection of the hypothesis of a perfect cartel since that hypothesis
would imply that n*(0, t) was less than or equal to unity. If it can be
assumed that true marginal costs are higher than zero, then stronger
statements can be made. For example, if it can be assumed that marginal
costs are higher than 6 cents (in 1967 cents), then any hypothesis
of less than four firm equivalents can be rejected, and if it can be
assumed that marginal costs are higher than 12 cents (in 1967 cents),



11 That is, it was assumed that the distribution was approximately
Gaussian and that var[n*(c, t)] ≈ dCd^T, where C is the covariance
matrix of the basic parameters and d is the row vector of partial
derivatives of n*(c, t) with respect to the parameters evaluated at
their point estimates.

12 The estimate of p'(t) is slightly higher than those reported by
Sumner for the same quantity. His estimates ranged from 1.029 to 1.074.
The discrepancy seems to result more from differences in estimation
technique than from differences in data sets; least-squares estimation
of (5) gave results for g_2 closer to Sumner's.

TABLE 2

Reduced-Form Estimates

                   Intercept        t         (t - t̄)^2
ln(Quantity)         5.119        -.0245       -.00013
                    (.016)        (.0012)      (.00018)
Price               14.24         1.089         .0090
                    (.34)         (.026)       (.0035)

A = |  .812   -.000 |        Σ = |  .713    -.611 |  × 10^[illegible]
    | 1.058    .512 |            | -.611   342.6  |

TABLE 3

Overall n*(c, t)

                                    95 Percent Confidence Interval
c (in 1967 Cents)   Point Estimate   Lower Bound   Upper Bound
        0                2.88            2.57          3.18
        1                3.08            2.75          3.40
        2                3.30            2.95          3.65
        3                3.57            3.19          3.95
        4                3.88            3.47          4.29
        5                4.23            3.80          4.70
        6                4.70            4.21          5.20
        7                5.26            4.70          5.82
        8                5.96            5.34          6.59
        9                6.89            6.16          7.02
       10                8.15            7.29          9.01
       11                9.99            8.93         11.04
       12               12.88           11.52         14.24


then any hypothesis of less than the equivalent of 11 firms can be
rejected.

Table 4 displays the dependence of n*(c, t) on t by tabulating
estimates as t ranges from 2 standard deviations below to 2 standard
deviations above its sample mean. This dependence evidently increases
with c. For c = 0, it is not very great, with all estimates lying
between 2.8 and 2.9, while for c = 12, it is more significant, with
estimates ranging from 10.5 to 13.4.

The empirical results of this paper are relatively robust to a number
of small modifications of the statistical model and methodology. For
example, using levels instead of logarithms for quantity produces a 95
percent interval estimate of 2.91 ± 0.33 for n*(0, t). Other models
based on logarithmic or power transforms of the basic variables also
give very similar results. Dropping the quadratic terms in the tax rate
changes the interval to 2.89 ± 0.27. Using the unjackknifed estimates
of A and Σ has a larger but still fairly minor effect on the results,
implying an interval for n*(0, t) of 2.75 ± 0.26.¹³ At least for the
smaller values of c, the confidence intervals given in table 3 seem to
adequately describe the degree of uncertainty in the estimates.

TABLE 4

Overall n*(c, t) for Various Values of t

                                     Value of c
  k     t = t̄ + k(σ_t/2)      0       4       8       12
 -4          7.31            2.80    3.80    5.89    13.16
 -3          8.88            2.83    3.84    5.97    13.36
 -2         10.45            2.86    3.87    6.01    13.38
 -1         12.02            2.87    3.89    6.01    13.22
  0         13.59            2.88    3.88    5.96    12.88
  1         15.16            2.87    3.86    5.89    12.42
  2         16.73            2.86    3.83    5.79    11.85
  3         18.30            2.84    3.78    5.66    11.22
  4         19.87            2.81    3.72    5.50    10.54

NOTE.— σ_t = standard deviation of the tax rate (in cents); t̄ = mean
tax rate = 13.59 cents.


V. Conclusions 

In this paper, I showed that a certain quantity, n*(c, t), relating to the
response of price and output to variations in the tax rate, was a lower
bound on the numbers equivalent of firms. Estimates of this quantity
showed that the data were not consistent with a numbers equivalent of
less than 2½ even if the industry faced zero marginal costs. Assumptions
of higher marginal costs implied higher estimates of the lower
bound. These results were found to be relatively robust to a number
of small changes in the statistical methodology.

It should be noted that they also possess a degree of robustness to
the simplifying assumption of no interstate substitution. As (4) makes
clear, the evidence produced here against highly noncompetitive
behavior comes essentially from a finding of too little elasticity of
demand. In treating states as independent units, this paper is using
evidence derived from the elasticity at the state level. To the extent
that interstate substitution is possible, this elasticity will be higher
than the elasticity cigarette manufacturers would face if they raised
prices in all states simultaneously. Thus, to the extent that interstate
substitution is empirically significant, the lower bounds given above
will be overly conservative, but they will not be incorrect.

13 A longer version of this paper, available on request from the
author, reports more detailed results of making small changes in the
statistical model. Also reported are the estimates of the other
parameters in (5) as well as state- and year-specific estimates of
n*(c, t).


Appendix A 

The statistical properties of the disturbance terms $e_{js}$ were
investigated with the aid of some of the techniques for identification
of multivariate time series described by Jenkins and Alavi (1981).
Jackknife-corrected estimates of the covariance matrix function C(k)
and the correlation matrix function R(k) were obtained from the
residuals of (5). In addition, these were used to obtain estimates of
the partial correlation matrix function S(k), which is, by definition,
the coefficient matrix of $e_{j,s-k}$ in the vector autoregression of
order k. The first few values of the estimates of R(k) and S(k) are
displayed in table A1.

The sharp decline in S(k) after the first lag combined with the
smoother decline in R(k) suggest the appropriateness of the first-order
autoregressive model. Examination of the residuals of the final
estimates gave no reason to doubt the adequacy of this model.


TABLE A1

Autocorrelation and Partial Autocorrelation
Matrices of Disturbance Terms


RlA) 

1.000 - 047 


- 047 
81 I 
031 
or>o 
008 
.487 

- 025 


1,000 
- 031 
.507 
020 
329 
001 
. 158 


.322 - 049 

- 095 .047 

105 - 053 

- .044 -.111 

- .087 - .042 

-.001 -.159 


(lct[R(A)l 

.998 

.412 

213 

076 


S(A) 


Oil 


014 


.014 


.821 
.989 
.127 
- .962 
065 
.982 
.102 


c 


.000 
.511 
002 
.176 
.003 
.087 
003 
I 562 069 

094 .000 

.673 .001 

-.239 .000 

.186 .068 


/ .102 -,003\ 

V - 1 562 069/ 


det[S(A)| 


.504 

.024 

009 

.011 

.001 


6 


-.016

Appendix B

Writing model (5) in the form

$$y_{js} = x_{js}\beta + e_{js}, \tag{B1}$$

the appropriate GLS transformation is the quasi first difference

$$R_1 y_{js} = R_1 x_{js}\beta + R_1 e_{js}, \qquad s = s_j,$$

and
$$\tag{B2}$$

$$R_2 y_{js} = R_2 x_{js}\beta + R_2 e_{js}, \qquad s > s_j,$$

where s, is the first year for which data are available for state j, R| = \ R‘> 

= I - A, and © is a solution of the equations 0 = A0A 7 . 

In the actual estimation, S(1) in table A1 took the place of A, and a
jackknifed version of the standard estimator based on the residuals of
(8) the place of Σ.

Monitoring and Hierarchies:

The Marginal Value of Information 
in a Principal-Agent Model 


Nirvikar Singh 

University of California, Santa Cruz 


This paper considers imperfect information about the agent's effort 
in a principal-agent relationship. It is shown that, under some 
smoothness conditions, the marginal value of such information, at 
the point of no information, is zero. This result is used to analyze the 
optimal level of monitoring in an organization and the nature of 
hierarchical organizations. 


I. Introduction 

In a principal-agent model with pure moral hazard, the agent takes a 
costly action that has an uncertain outcome. If the principal observes 
the outcome but not the action, he can design a payment rule for the
agent, based on the outcome, that provides the latter with appropriate 
incentives to act. If the two individuals have different attitudes toward 
risk and the agent has a reservation level of utility for accepting the 
“contract,” there is a conflict between incentive provision and risk 
sharing, and the outcome is less efficient than if the principal could 
observe the agent's action costlessly. 

In this situation, any information about the agent's action, no matter
how imperfect, is valuable if it was not [...] outcome. This
information is referred to as "[...]." This statement of value does
not, of course, consider [...] acquiring information.

In this paper I show that the marginal value of information, at the
point of no information, is zero, if some plausible smoothness
conditions hold. This result is a modification of that of Radner and
Stiglitz (1984). [...] marginal costs of gathering information are
always [...] optimal level of monitoring by the principal.

A concrete example of a principal-agent situation is an
employer-employee relationship. In [...] the employee's action is
effort, and monitoring by the principal [...] considered to involve
costly [...] effort. In a simple organization model, we show that the
result above implies a concentration of monitoring in a large two-level
organization. This in turn suggests a rationale for a multilevel
organization with intermediate levels serving as monitors only. We
discuss the characterization of optimal payment rules in these cases.
In a broader context, we provide further illumination for the reasons
hierarchical organizations exist.

This paper is a revision and extension of Bell Laboratories Working
Paper no. 218. I am indebted to Roy Radner for suggesting the problem
and for detailed comments that improved the exposition and the proofs.
Further detailed comments from an anonymous referee were also
invaluable. I also thank Bell Labs for providing a pleasant summer
working environment and Andreu Mas-Colell for encouragement and
insights. Of course, none of the above are implicated in the final
product: errors are solely mine.

The structure of the paper is as follows. In Section II, I summarize 
the relevant principal-agent literature and discuss the nature of moni¬ 
toring. Section III provides a formal statement of the model and the 
result on the marginal value of monitoring information. Section IV 
introduces a simple model of an organization and analyzes the optimal
level of monitoring and the nature of hierarchies. Section V
contains a concluding summary and discussion of the results.


II. Observability and the Nature of Monitoring 

The discussion in this section is informal. The details of the model are
contained in the next section. However, I shall introduce some notation
here to facilitate presentation: $a$ = agent's action, $x$ = output,
$y$ = principal's signal about the agent's action, and
$F(x, y; a, \theta)$ = joint distribution function of $x$, $y$ given
$a$, $\theta$; $F: R_+ \times R \times A \times \Theta \to [0, 1]$,
where $\theta$ is a real-valued parameter and $f$ is the corresponding
density.

Holmström (1977, 1979) provides the first treatment of the case
where the principal observes the outcome of some random variable, $y$,
in addition to $x$. Holmström (1979) defines a signal $y$ to be
informative about $a$ if and only if

$$f(x, y; a) = g(x, y)\,h(x; a) \quad \text{almost everywhere}$$

is false. This definition is motivated by Holmström's proposition 3,
which shows that a signal is valuable to the principal (the agent
receiving $K$ always) if and only if it is informative. This result is
generalized in Gjesdal (1982), Holmström (1982), and Grossman and Hart
(1983). However, I restrict the analysis of this paper to the original
Holmström model. The addition here is the parameterization, in terms of
$\theta$, of the information systems (i.e., the signal $y$ and its
distribution). I shall next discuss the concrete interpretation of
$\theta$ and $y$.

One might simply postulate that monitoring consists of the gathering
of a report on the agent's level of action. Alternatively, one can
assume that monitoring consists of direct observation of the agent's
effort over some period of time. If the agent puts in a constant level of
effort over a period of time, monitoring cannot be imperfect: the
principal could achieve costless perfect monitoring by observing the
effort level for an instant. More realistically, the agent's effort over an
interval of time would be a stochastic process, with output depending
on the mean, $a$, of the process. Imperfect monitoring is then a
sampling procedure. On the other hand, if there is some probability of
making a correct observation of the effort of the agent, failing which
nothing is learned, and if this probability is an increasing function of
care of monitoring, $\theta$, then it is possible to construct a random
variable $y = p(\theta)a + [1 - p(\theta)]\epsilon$, where $\epsilon$ is
white noise and $p' > 0$.

It is also possible, since $y$ is an artificial construct summarizing the
monitoring process, to replace $p(\theta)$ by $\theta$. Hence, here
$\theta$ can vary continuously, and $y$ is well defined for
$\theta = 0$, the case of no information. These properties will be
useful in the next section. Finally, note that conditional monitoring
(e.g., Holmström 1979) could be modeled similarly.
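The signal construction above, with the monitoring parameter substituted for the probability function, can be sketched as follows. The function name and the numerical values of the action and the noise draw are hypothetical, chosen only to show the two polar cases.

```python
def monitoring_signal(theta, action, noise):
    """Signal y = theta * a + (1 - theta) * eps for monitoring care theta in [0, 1]."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * action + (1.0 - theta) * noise


# Hypothetical action and noise draw.
action, noise = 5.0, 0.3
signals = {theta: monitoring_signal(theta, action, noise)
           for theta in (0.0, 0.5, 1.0)}
```

At a monitoring parameter of zero the signal is pure noise and carries no information about the action, and at one it reveals the action exactly; intermediate values blend the two continuously, which is the property exploited in the next section.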

III. The Marginal Value of Information 

In this section, I examine the conditions under which the marginal 
value of information is zero at the point of no information, in the 
basic principal-agent model. This result is a modification of the result 
of Radner and Stiglitz (1984). Unlike them, I assume differentiability 
and hence use the calculus. I also discuss when differentiability holds. 
In the light of the discussion of monitoring in the previous section, I
assume that $\theta$ is a real-valued parameter indexing the family of
available information structures. I assume $\theta \in \Theta = [0, 1]$
and that $\theta = 0$ represents a noninformative structure.

Radner and Stiglitz consider a decision maker choosing an action to
maximize expected utility subject to some constraints. The crucial
assumptions are that (i) the information structure is differentiable in
$\theta$ at $\theta = 0$, in a suitable sense, and (ii) there is a family
of optimal decision rules that is continuous in $\theta$ and flat at
$\theta = 0$; that is, it results in the same decision for all $y$.
Radner and Stiglitz also made the assumption that the decision maker's
constraint set is nonincreasing in $\theta$. This incorporates direct
costs or restrictions imposed by the gathering of better information and
allows them to conclude that the marginal gross value of information is
nonpositive at $\theta = 0$. In the principal-agent model, the principal
chooses a sharing rule but is constrained by the agent's
utility-maximizing choice of action, which depends on the sharing rule.
There is no guarantee that the principal's constraint set will be
nonincreasing in $\theta$. Hence we cannot apply the Radner-Stiglitz
result directly. However, to show that the marginal gross value of
information is zero (rather than nonpositive) at $\theta = 0$, we need
only that the choices of the principal and agent be sufficiently smooth
functions of $\theta$. The result is then obtained as a variant of the
"envelope theorem."

It turns out to be difficult to guarantee that the required smooth¬ 
ness conditions hold for the general principal-agent model. While 
these are important for understanding the applicability of the result, 
they are not central to the intuition behind the result itself. There¬ 
fore, we assume first the required properties of sharing rule and the 
agent’s action. A discussion of these assumptions then follows. This 
approach also allows us to consider arbitrary decision functions for 
the agent, and not just those that satisfy expected utility maximiza¬ 
tion. 

The following notation is used: $G$ = principal's utility, a function of
wealth; $U$ = agent's utility-of-wealth function; $V$ = agent's
disutility-of-effort function; $s$ = sharing rule; and $K$ = agent's
reservation utility. Note that the agent's utility is assumed to be
additively separable throughout.

Given some policy $a^*(s, \theta)$ for the agent, the principal solves the
following problem:

$$\max_{s} \int G(x - s)\,dF(x, y; a, \theta) \tag{1}$$

subject to

$$\int U(s)\,dF(x, y; a, \theta) - V(a) \ge K$$

and $a = a^*(s, \theta)$.

I make the following assumptions: 

A1: $a \in A$, a compact subset of $R$.

A2: $G: R \to R$, $U: R \to R$, $V: R \to R$ are twice continuously
differentiable, $G' > 0$, $G'' \le 0$, $U' > 0$, $U'' < 0$, $V' > 0$,
$V'' \ge 0$.

A3: $x$ and $y$ are observable, so that $s$ is a function of $x$ and $y$.
Furthermore, the choice of sharing rules is restricted to
$S = \{s: R_+ \times R \to [c, d + x] \mid s \text{ is nondecreasing}\}$.

A4: $F(x, y; a, \theta)$ has a density function $f(x, y; a, \theta)$ that
is twice continuously differentiable in $a$ and in $\theta$.¹
$F_a < 0$. Furthermore, the marginal distribution of $x$ is independent
of $\theta$; that is, $f(x, y; a, \theta) = g(x; a)\,h(y|x; a, \theta)$,
where $g(\cdot)$ is the marginal density of $x$ and $h(\cdot)$ the
conditional density of $y$ given $x$.

A5: $h(y|x; a, 0) = h^0(y|x)$.

A6: $s^*(x, y, 0) = s^0(x)$, where $s^*$ is a solution to (1).

A7: $a^*(s^0, \theta) = a^0$.

A8: $s^*$ and $a^*$ are differentiable functions of $\theta$.

A1-A4 are familiar assumptions suitably modified. A5 is the
formalization of the idea that $\theta = 0$ is a noninformative signal:
for $\theta = 0$, $y$ contains no information about $a$ beyond that
conveyed by $x$. A5 is not necessary for the proof but makes A6 and A7
plausible at the optimum. A6 states that the sharing rule is independent
of a noninformative signal, that is, flat. A7 states that monitoring
information does not matter for the agent if it does not affect his
share. I shall discuss later the conditions under which A6-A8 hold.
Finally, I assume that it is always possible to interchange the order of
differentiation and integration. This and other technical assumptions
are discussed in Holmström (1977) and Stoeckenius (1979).

I now state and prove

Proposition 1. Let $J(\theta)$ be the optimal value of the principal's
expected utility for information structure $\theta$, that is, the value
obtained by solving program (1). Then, given A1-A8, $J'(0) = 0$.

Proof. Let $G(s, \theta) = \iint G[x - s(x, y)]\,f[x, y; a^*(s, \theta), \theta]\,dx\,dy$
and $H(s, \theta) = \iint U[s(x, y)]\,f[x, y; a^*(s, \theta), \theta]\,dx\,dy - V[a^*(s, \theta)]$.
So $J(\theta) = G[s^*(\theta), \theta]$. Hence
$J'(\theta) = D_1 G[s^*(\theta), \theta] \cdot s^{*\prime}(\theta) + D_2 G[s^*(\theta), \theta]$,
where $D_i$ denotes the gradient with respect to the $i$th argument.

The optimum is characterized by

$$D_1 G(s, \theta) + \lambda D_1 H(s, \theta) = 0 \tag{2}$$

and

$$H(s, \theta) - K = 0 \tag{3}$$

if this equation has a solution $c \le s^* \le d + x$. (Otherwise I set
$s^* = c$ or $d + x$.) Furthermore, differentiating (3) with respect to
$\theta$ at the optimal $s = s^*(\theta)$,

$$D_1 H[s^*(\theta), \theta] \cdot s^{*\prime}(\theta) + D_2 H[s^*(\theta), \theta] = 0. \tag{4}$$

Using (2) and (4), we have

$$J'(\theta) = \lambda^*(\theta)\,D_2 H[s^*(\theta), \theta] + D_2 G[s^*(\theta), \theta]. \tag{5}$$

From the definition of $H$,

$$D_2 H(s, \theta) = \Big\{ \iint U[s(x, y)]\,f_a[x, y; a^*(s, \theta), \theta]\,dx\,dy - V'[a^*(s, \theta)] \Big\}\,D_2 a^*(s, \theta) + \iint U[s(x, y)]\,D_4 f[x, y; a^*(s, \theta), \theta]\,dx\,dy. \tag{6}$$

By A7,

$$D_2 a^*(s^0, \theta) = 0. \tag{7}$$

By A4,

$$D_4 f(x, y; a, \theta) = g(x; a)\,D_4 h(y|x; a, \theta). \tag{8}$$

Furthermore, for every $\theta$ and $a$, $\int h(y|x; a, \theta)\,dy = 1$, so that

$$\int D_4 h(y|x; a, \theta)\,dy = 0. \tag{9}$$

Using (7) and (8) in (6),

$$D_2 H(s^0, 0) = \iint U[s^0(x)]\,g(x; a^0)\,D_4 h(y|x; a^0, 0)\,dx\,dy = \int U[s^0(x)]\,g(x; a^0) \Big[ \int D_4 h(y|x; a^0, 0)\,dy \Big] dx = 0$$

by (9). A similar argument shows that $D_2 G(s^0, 0) = 0$. Hence, putting
$\theta = 0$ in (5) yields $J'(0) = 0$. Q.E.D.

1 This rules out the case of the normal distribution, where $\theta$ is
the precision, or reciprocal of the variance. In such a case, $F$ is not
differentiable at $\theta = 0$. However, if $\theta$ is the square root
of the precision for the normal distribution, differentiability holds.
Another possible objection remains where $\theta$ is such a function of
the precision: $\theta = 0$ no longer represents a proper distribution.
This objection is avoided by the example in the text.

I now turn to the issue of whether A6—A8 are satisfied in the case 
where the principal is faced with an agent who maximizes expected 
utility given the sharing rule. 

It turns out (Mirrlees 1979; Grossman and Hart 1983; Rogerson 
1983) that the following two additional assumptions enable the use of 
the calculus to characterize the optimal sharing rule. 

A9: $F(x, y; a, \theta)$ is strictly convex in $a$, that is, $F_{aa} > 0$.

A10: The ratio $f_a/f$ is increasing in $x$, that is,
$f_{ax} f - f_a f_x \ge 0$.

The interpretation of A10 is in Milgrom (1981). A discussion of A9
is in Singh (1984): it is sufficient if $V'' \ge 0$.

If A1-A10 are given, except at corners, $s^*(x, y)$ satisfies

$$\frac{G'[x - s^*(x, y)]}{U'[s^*(x, y)]} = \lambda^* + \mu^* \frac{f_a(x, y; a^*, \theta)}{f(x, y; a^*, \theta)}, \tag{10}$$

where $\lambda^*$, $\mu^*$, and $a^*$ are obtained by simultaneous solution
of (10) and

$$\iint G[x - s^*(x, y)]\,f_a(x, y; a^*, \theta)\,dx\,dy + \mu^* \Big\{ \iint U[s^*(x, y)]\,f_{aa}(x, y; a^*, \theta)\,dx\,dy - V''(a^*) \Big\} = 0, \tag{11}$$

$$\iint U[s^*(x, y)]\,f_a(x, y; a^*, \theta)\,dx\,dy - V'(a^*) = 0, \tag{12}$$

$$\iint U[s^*(x, y)]\,f(x, y; a^*, \theta)\,dx\,dy - V(a^*) = K, \tag{13}$$

where I assume $U$ is such that the last equality holds. Note particularly
that the argument $\theta$ is suppressed in $s^*$, $a^*$, $\lambda^*$, and
$\mu^*$ in (10)-(13). It is rather straightforward to repeat the proof of
proposition 1 for the case where $s^*(x, y, \theta)$, $a^*(\theta)$ are
solutions to (10)-(13), if they are differentiable in $\theta$. In
particular, one obtains that these equations, together with A5, ensure
that $s^*(x, y, 0) = s^0(x)$ (A6) and $D_2 a^*(s^0, \theta) = 0$ (A7). The
interesting question is therefore regarding the differentiability of
$s^*$, $a^*$ (A8).

Essentially, it is necessary to check whether one can apply the
implicit function theorem to (10)-(13). The difficulty is that (10) holds
for every $x$, while (11)-(13) are integral equations. I therefore
consider (10) as defining $s(x, y, \lambda, \mu, a, \theta)$. Since
$U'' < 0$, by the implicit function theorem, the derivatives of $s$ with
respect to $\lambda$, $\mu$, and $a$ exist. Now I can substitute
$s(x, y, \lambda, \mu, a, \theta)$ in (11)-(13) and obtain three equations
in $\lambda(\theta)$, $\mu(\theta)$, and $a(\theta)$. If the solution is
differentiable in $\theta$, then $s(x, y, \theta)$ also will be
differentiable in $\theta$.

Hence I have reduced the problem to checking whether the determinant
of the derivative of (11)-(13) with respect to $(\lambda, \mu, a)$ is
nonzero. This has, unfortunately, not yielded to analysis or further
assumptions. I am thus unable to demonstrate differentiability in
general, but it seems that nondifferentiability will be "accidental."


IV. Organizational Structure: Hierarchies 

In this section, I first show that proposition 1 has consequences for
the organization of production, in the context of a simple model. I
then relate the principal-agent approach to other models of hierarchical
organization, clarifying the nature of an organization in terms of
which decisions are centralized.

I tell the following simple story as motivation. Assume the existence
of a single owner of some resource that is an input, together with
labor, in the production of some commodity. The price of output is
fixed and normalized at one. Output is also subject to random
fluctuations. There is a pool of $N$ potential workers. The capitalist's
problem is how to organize production, given that he cannot (will not)
work himself but can expend effort on supervising workers. The quantity
of the resource is assumed not to be an effective constraint, and its
allocation is suppressed in what follows.

As above, $\theta_i$ is a parameter that measures the informativeness of
the signal $y_i$ of agent $i$'s effort. As before, for example,
$y_i = \theta_i a_i + (1 - \theta_i)\epsilon_i$, where $\epsilon_i$ is a
random variable independent of $a_i$. I adopt the convention suggested by
this example, that $\theta_i = 1$ represents a completely informative
signal ($y_i = a_i$), in addition to $\theta_i = 0$ being no information.

I make the following simplifying assumptions. None of the
simplifications is essential; they merely highlight the point I wish to
make.

B1: $x_i = x(a_i, z_i)$, where the $z_i$'s are i.i.d. random variables.

B2: Total output is $X = \sum_{i=1}^{n} x_i$, where $n$ is the number of
workers employed.

B3: Individual output can be observed perfectly and costlessly;
individual effort can be monitored only at a cost.

B4: All workers are identical.

B5: The capitalist's utility function is simply $w - C(e)$, where $w$ is
his share of total output (a random variable) and $e$ is his total
monitoring effort.

B6: Monitoring effort is given by $e = \sum_{i=1}^{m} \theta_i$, where $m$
is the number of agents monitored, $m \le n$.

B7: The typical worker's opportunity cost of working for this
capitalist is $K$, independent of $n$.

B8: The monitoring cost function, $C$, is strictly increasing and
convex; $C'(0) > 0$ exists.²

These assumptions ensure that $\theta_i = \theta$ for $i = 1, \ldots, m$,
so that $e = m\theta$. For given $n$, $m$, and $\theta$, the capitalist's
problem is to design a sharing rule for each worker. It is obvious that
there will be no benefit in this framework from basing one worker's
payoff on another's output. Thus, for each worker this reduces to the
standard principal-agent model. I assume that the solution is
differentiable in $\theta$, so that proposition 1 may be applied.

2 This rules out, e.g., $C(\theta) = c\theta^2$, where $C'(0) = 0$. Hence,
if $\theta$ is the square root of the precision, a cost function that is a
linear or quadratic function of the precision is ruled out by this
assumption. Whether this is reasonable depends on the
information-gathering technology, e.g., how sample size relates to
precision and cost relates to sample size.

I shall now show that the capitalist will choose to monitor only a
subset of the potential workers, provided that the pool $N$ is large
enough.

Proposition 2. If $m^*$ is the optimal number of workers monitored,
for $N$ sufficiently large, $m^*$ is independent of $N$.

Proof. The capitalist's objective function is
$mJ(\theta) - C(m\theta) + (n - m)J(0)$. Let
$\Delta(\theta) = J(\theta) - J(0)$ and
$\pi(m, \theta) = m\Delta(\theta) - C(m\theta)$. Then $\Delta(0) = 0$, by
definition, and $\Delta'(0) = 0$ by proposition 1. Furthermore, $\Delta$
is nondecreasing. Also,
$D_1\pi(m, \theta) = \Delta(\theta) - \theta C'(m\theta)$, and
$D_2\pi(m, \theta) = m\Delta'(\theta) - mC'(m\theta)$. Hence
$D_2\pi(m, \theta) \ge 0 \Leftrightarrow \Delta'(\theta) - C'(m\theta) \ge 0$.
Let $\Theta(m) = \{\theta: \theta \in [0, 1],\ D_2\pi(m, \theta) \ge 0\}$.
Then, by convexity of $C$, $\Theta(m)$ is decreasing in $m$. This together
with $\Delta'(0) = 0$, $C'(0) > 0$, implies that there exists
$\theta_0 > 0$ such that $\Delta'(\theta) - C'(m\theta) < 0$ for all
$m \ge 1$, $0 \le \theta < \theta_0$. Hence the optimal
$\theta = \theta^* \ge \theta_0$. Let
$M(\theta) = \{m: D_1\pi(m, \theta) \ge 0\}$. Then there exists $m_0$ such
that $\theta \ge \theta_0 \Rightarrow M(\theta) \subset [0, m_0]$. Hence,
for $N$ sufficiently large, $m^*$ is independent of $N$. Q.E.D.
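A small numerical illustration of proposition 2 can be given by grid search. The functional forms below are assumptions made purely for illustration and do not come from the paper: the gain from monitoring is taken to be three times the square of the monitoring parameter (so it is zero with zero slope at no information, as proposition 1 requires) and the cost of total monitoring effort e is taken to be e + e²/2 (strictly increasing, convex, with positive marginal cost at zero, as in B8). All function names are hypothetical.

```python
def delta(theta):
    """Assumed gain per monitored worker: Delta(0) = 0 and Delta'(0) = 0."""
    return 3.0 * theta ** 2


def cost(effort):
    """Assumed monitoring cost: strictly increasing, convex, C'(0) = 1 > 0."""
    return effort + 0.5 * effort ** 2


def net_gain(m, theta):
    """pi(m, theta) = m * Delta(theta) - C(m * theta)."""
    return m * delta(theta) - cost(m * theta)


def best_plan(pool_size, grid_points=101):
    """Grid search over m = 0..N and theta in [0, 1]; returns (gain, m, theta)."""
    thetas = [i / (grid_points - 1) for i in range(grid_points)]
    candidates = ((net_gain(m, theta), m, theta)
                  for m in range(pool_size + 1)
                  for theta in thetas)
    return max(candidates)


# The optimal number of monitored workers for pools of increasing size.
plans = {n: best_plan(n) for n in (5, 50, 500)}
```

Under these assumed forms the search selects two fully monitored workers for every pool size tried, so enlarging the pool leaves the number monitored unchanged, in line with the proposition.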

The result above covers two possibilities. If $J(0) < 0$,
$m^* = n^* < N$. If $J(0) \ge 0$, $n^* = N > m^*$. In the second case, the
result of proposition 1 is crucial. Intuitively, there are initially
increasing returns to information gathering, and these provide the
rationale for a particular organization of production. One may interpret
the $m^*$ workers who are being monitored as being organized in a
factory, and the other $N - m^*$ as working under a putting-out system.
There are other reasons for the use of a factory mode of production, of
course, for example, the assembly line, which might be modeled as a
production function of the form $x = \min_i\{x_i(a_i, z_i)\}$. In such a
case, the agent's shares would depend on the observation of others'
performance, in terms of either $x_i$ or $a_i$. I conjecture, however,
that the results of propositions 1 and 2 will hold in these situations.
A detailed discussion of issues relating to those I have touched on is
that of Marglin (1974). I have thus provided a formal, "neoclassical"
justification for one of his arguments.

There remains one question about the model above. Since moni¬ 
toring has positive value, will it pay the principal to remove workers 
from production and assign them to information-gathering tasks 
only? One may think of these agents as foremen or managers. This 
possibility introduces another rung on the incentive ladder, and I 
briefly discuss this. 

Williamson (1967), Mirrlees (1976), and Calvo and Wellisz (1978) 
have examined questions of hierarchical structure. Mirrlees examines 
the issue of an inspection hierarchy where each level determines the 
monitoring intensity as well as the payment rule for the level just 
below. An alternative assumption is that the top level (i.e., the princi¬ 
pal) determines payment rules for all lower levels, as in Calvo and 
Wellisz. This is a polar case to Mirrlees’s model. My model is complementary
to that of Calvo and Wellisz, in that I derive behaviorally
what they assume, namely, the optimal level of $\theta$.

Assuming that $J(0) \ge 0$, so that $n^* = N$, it is possible to examine the
possibility of making some of these $N$ workers into foremen as a way
of increasing efficiency through better monitoring. The formal model
is a straightforward, if messy, extension of the two-level case, and I
omit it here. If $k$ is the optimal number of foremen, then $k > 0$, if the
foremen are reasonably productive in monitoring. The principal
monitors the foremen, whose payment depends on a signal of their
effort and on workers’ output. Importantly, foremen have no incentive
to lie to the principal. Thus a rationale for decentralization of
payment rules must be sought elsewhere. The workers’ shares not 
only reflect how the foremen’s decisions affect the “strength of inference,”
fjf, by the principal, but also involve/',,, directly, indicating that
second-order inferences are relevant in this case. This suggests, there¬ 
fore, that an economic interpretation of the variation of f„ a with x and 
y, along the lines of A10, is necessary. 

Finally, assuming differentiability of the solution with respect to $\theta$,
proposition 1 should apply to the three-level hierarchy as well. In that
case, it may not be optimal to monitor all k foremen, and one or more 
additional levels may be optimal. 


V. Conclusion 

In this paper, I have examined the nature of monitoring in a princi¬ 
pal-agent model and its implications for organizational structure. I 
showed that, under some smoothness assumptions, there are initially 
increasing returns to information, and I explored the implications of 
this result for the hierarchical structure of an organization. In particular,
I clarified some issues of decentralization: the distinction between
a “factory” and “putting-out” system and the degree of decentralization
of payment decisions. I focused on the determination of
optimal monitoring and of optimal payment rules in a general 
framework. Specialization of assumptions, and hence more specific 
conclusions, remain a topic for future research. 
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The Leontief Paradox: Continued or Resolved? 

François R. Casas

University of Toronto


E. Kwan Choi

University of Missouri-Columbia


I 

An important corollary of the Heckscher-Ohlin-Vanek model of international
trade is that under balanced trade a country will be a net
exporter of the services of its abundant factors and a net importer of
the services of its scarce factors, where abundance and scarcity are
defined in terms of a factor-price-weighted average of all resources.
Thus, if $K_x$ and $K_m$ denote the amounts of capital services embodied in
exports and imports, respectively, and $L_x$ and $L_m$ the corresponding
amounts of labor services, then $K_x > K_m$ and $L_x < L_m$ if capital is
abundant and labor is scarce, implying that $K_x/L_x > K_m/L_m$.

However, even under balanced trade conditions it is not possible to 
infer relative resource abundance from the observed factor intensity 
ranking of traded goods when two factors are exported or imported 
simultaneously, as was the case for the 1947 U.S. data used by Leontief
(1954) in his famous study. Those data revealed that the United
States had been a net exporter of the services of both capital and
labor, and Leamer (1980) has clearly demonstrated that such a situation
is compatible with either ordering of capital and labor abundance.
However, Leamer was able to establish conclusively that Leontief’s
data together with information on the supplies of capital and
labor implied that the United States had been revealed to be abundant
in capital compared to labor.

Leamer’s argument did not address the question of whether the
positive net exports of labor services by the United States could be 
taken as an indication of labor abundance relative to all resources on 
the average. Had U.S. foreign trade been balanced in 1947 the an¬ 
swer to this question would be unambiguously positive, but as we 
demonstrate in the next section it is not possible to uniquely link 
resource abundance or scarcity with the sign of the net export flow of 
factor services in the presence of a trade surplus or deficit. One solu¬ 
tion to this difficulty is to compute the factor services embodied in the 
net commodity exports that would have been observed under bal¬ 
anced trade conditions. Such calculations are made difficult by the 
fact that the commodity composition of a country’s trade would neces¬ 
sarily be different under those conditions, but the Heckscher-Ohlin- 
Vanek model implies a very special and predictable change in this 
composition. These calculations are shown in Section III, and they
reveal that the United States would have been a net importer of labor 
services under balanced trade, a result that sharply contrasts with 
Brecher and Choudhri’s (1982) suggestion that trade balance would 
have left the United States a net exporter of labor services. The note 
concludes by indicating that another paradox outlined by Brecher 
and Choudhri remains unsolved. 

II 

Consider a model with $m$ commodities and $n$ factors, internationally
identical and linearly homogeneous production functions, internationally
identical and homothetic consumer preferences, complete international
factor price equalization, and perfectly competitive product
and factor markets. With $Y_i$ and $Y_w$ denoting the domestic and
world income levels, let $\beta_i = Y_i/Y_w$ measure country $i$’s relative share of
world income.

While under balanced trade aggregate domestic absorption equals 
income, the two differ in the presence of a surplus or deficit. Since 
world absorption and income are identically equal, the relative share 
of country $i$ in world absorption, $\alpha_i$, is given by

$$\alpha_i = \frac{C_i}{C_w} = \frac{C_i}{Y_w}, \tag{1}$$

where $C_i$ and $C_w$ are the levels of absorption in country $i$ and the
world.

In a multifactor world it is convenient to adopt the definition of
factor abundance proposed by Williams (1970), which states that
country $i$ is abundant (scarce) in resource $l = L, K, \ldots$, if the ratio of
its endowment of that factor, $l_i$, to that of the world, $l_w$, exceeds (falls
short of) the ratio of the country’s income to world income, that is, if
$l_i/l_w > (<)\ \beta_i$. 1

As Brecher and Choudhri (1982) have shown, the amount of services
of factor $l$ embodied in country $i$'s net exports, $l_x$, may be written



( 2 ) 


(3) 


which implies that under balanced trade, $C_i = Y_i$, country $i$’s
exports of factor $l$ are positive (negative) if and only if
that factor is abundant (scarce); that is, $l_x \gtrless 0$ if and only if $l_i/l_w \gtrless \beta_i$.
However, when trade is not balanced it is possible for a deficit country,
$C_i > Y_i$, to import the services of an abundant resource or conversely
for a surplus country, $C_i < Y_i$, to export the services of a scarce
factor. It follows that the export of labor services by a surplus country
does not constitute evidence of labor abundance, nor do such exports 
contradict a hypothesis of labor scarcity. 

Although the actual signs of the indirect net resource flows may not 
be an accurate indicator of factor abundance, it is possible to infer the 
latter by computing the hypothetical balanced trade flows of factor
services from existing trade data. The assumption of internationally 
identical and homothetic consumer preferences implies that a trade 
imbalance does not affect the commodity composition of world de¬ 
mand and, hence, commodity prices and supplies. However, an im¬ 
balance would decrease (increase) the demand for all goods and con¬ 
sequently the derived demand for all factors equiproportionately in 
country $i$ if that country has a trade surplus (deficit) in comparison
with the balanced trade state. If $l^b_c$ denotes the amount of factor $l$
services embodied in domestic absorption under balanced trade and $l_c$
the corresponding amount embodied in actual absorption, then

$$l^b_c = \frac{Y_i}{C_i}\, l_c. \tag{4}$$


1 In the two-factor case this definition coincides with the familiar physical definition
of relative factor abundance since, e.g., capital abundance would be defined as
$K_i/K_w > Y_i/Y_w = (W_k K_i + W_l L_i)/(W_k K_w + W_l L_w)$, where $W_l$ is the price of factor $l$.
It is readily seen that this inequality holds if and only if $K_i/K_w > L_i/L_w$.
It follows that the net amount of services of factor $l$ that would be
exported under balanced trade is given by

$$l^b_x = l_i \left[ 1 - \frac{l_c}{l_i}\,\frac{Y_i}{C_i} \right], \tag{5}$$

so that

$$l^b_x \gtrless 0 \quad \text{if and only if} \quad \frac{l_c}{l_i} \lessgtr \frac{C_i}{Y_i}. \tag{6}$$

This expression reveals that a country with a trade imbalance would 
have been a net exporter of the services of factor $l$ under balanced
trade, and the country would therefore be revealed abundant in that 
factor, if the ratio of domestic consumption to endowment of the 
factor is smaller than the ratio of domestic absorption to income, and 
vice versa. 2 Such a test has the significant advantage of relying exclu¬ 
sively on domestic data. 

III

Leontief’s (1954) data show that U.S. exports in 1947 amounted to
$16,678.4 million while imports were $6,176.7 million. His figures 
also show that with exports using 182.313 man-years and import re¬ 
placements using 170.004 man-years per million dollars, the United 
States had indirectly exported 1,990,795 man-years of labor services 
more than it had imported. Given the Travis (1964) estimate of the 
U.S. labor endowment of 47,273,526 man-years, the ratio of labor 
embodied in domestic absorption to domestic endowment is thus 
0.95788. This can be compared with the ratio of domestic absorption 
to domestic income of 0.94714 based on the Woytinsky and Woytinsky
(1953) estimate of the U.S. national income of $198,688 million
together with Leontief’s export and import figures. 

It follows from our earlier discussion that if U.S. trade had been 
balanced, labor services would have been imported and the country's 
labor scarcity would have been directly revealed. Specifically, equa¬ 
tion (5) yields an estimate of 536,453 man-years for the hypothetical 
balanced trade level of net imports of labor services. It may thus be 
asserted that the Leontief paradox in terms of the indirect factor 
trade version of the Heckscher-Ohlin-Vanek theory can be satisfactorily
explained within the framework of that theory rather than by rejecting
some of its fundamental assumptions.

2 It should be observed that by increasing or decreasing the demand for all goods and
for all factors equiproportionately, a trade imbalance will not alter $l_c/C_i$, which will thus
remain equal to $l_w/C_w = l_w/Y_w$, regardless of whether trade is balanced or not. Together
with the definition of factor abundance, this provides an alternative derivation of eq.
(6), which was suggested by Edward E. Leamer.
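As a rough check on the arithmetic in this section, the following sketch recomputes the two ratios compared in the test of equation (6) and the hypothetical balanced trade flow of labor services from equation (5), using only the figures quoted above. It assumes that absorption equals national income minus the trade surplus and that labor embodied in absorption equals the endowment minus net labor exports; the function and variable names are illustrative and do not come from the note. The results should land close to the reported values, with small differences plausibly due to rounding in the published figures.

```python
# Figures quoted in Section III (Leontief 1954; Travis 1964;
# Woytinsky and Woytinsky 1953). Names below are illustrative assumptions.
EXPORTS = 16_678.4             # U.S. exports in 1947, $ million
IMPORTS = 6_176.7              # U.S. imports in 1947, $ million
NATIONAL_INCOME = 198_688.0    # U.S. national income, $ million
LABOR_ENDOWMENT = 47_273_526   # man-years
NET_LABOR_EXPORTS = 1_990_795  # man-years embodied in actual net exports


def absorption_income_ratio(income, exports, imports):
    """C/Y, assuming absorption C = income minus the trade surplus."""
    return (income - (exports - imports)) / income


def absorption_endowment_ratio(endowment, net_factor_exports):
    """l_c/l: share of the endowment embodied in domestic absorption,
    assuming l_c = endowment minus net factor exports."""
    return (endowment - net_factor_exports) / endowment


def balanced_trade_net_exports(endowment, net_factor_exports, c_over_y):
    """Equation (5) as read here: l * [1 - (l_c/l) * (Y/C)].
    A negative value means net imports under balanced trade."""
    lc_ratio = absorption_endowment_ratio(endowment, net_factor_exports)
    return endowment * (1.0 - lc_ratio / c_over_y)


c_over_y = absorption_income_ratio(NATIONAL_INCOME, EXPORTS, IMPORTS)
lc_over_l = absorption_endowment_ratio(LABOR_ENDOWMENT, NET_LABOR_EXPORTS)
hypothetical = balanced_trade_net_exports(
    LABOR_ENDOWMENT, NET_LABOR_EXPORTS, c_over_y
)

print(f"C/Y   = {c_over_y:.5f}")
print(f"l_c/l = {lc_over_l:.5f}")
print(f"balanced-trade net labor exports = {hypothetical:,.0f} man-years")
# Test of equation (6): the factor is revealed scarce when l_c/l exceeds C/Y.
print("labor revealed scarce" if lc_over_l > c_over_y else "labor revealed abundant")
```

Because the labor ratio exceeds the absorption-income ratio, the hypothetical flow comes out negative, that is, as net imports of labor services of roughly the magnitude reported in the text.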

The Leontief paradox of relatively capital-intensive U.S. import
replacements can also be explained in terms of the trade surplus rather
than in terms of the factor multiplicity invoked by [...]. Equation (5)
yields an estimate of $6,426 million for exports of capital services
under balanced trade compared with their actual value of [...]. With the
United States a net capital exporter and labor importer, the balanced
trade figures thus reveal U.S. exports to be characterized by a higher
capital-labor ratio than its import replacements.


IV 

While our analysis suggests that adjusting Leontief’s data to eliminate
the influence of the trade surplus will resolve the Leontief paradox
whether in terms of the factor intensity ranking of traded goods or in
terms of the net export of labor services, the data also reveal another
paradox, which cannot be explained within the framework of the
relative factor endowment theory.

By definition, if resource $l$ is scarce in a country, domestic income
per unit of that factor will be larger than the world level, and vice
versa. However, while equation (3) shows that such a scarce factor
may be exported in the presence of a surplus, equation (2) indicates 
that the assumptions of the Heckscher-Ohlin-Vanek model imply that 
domestic absorption per unit of an exported resource—whether 
abundant or scarce—will be lower than the world level. 

Available evidence such as Denison’s (1967) data suggests that both
income and absorption per worker in the United States were considerably
higher than in its major trading partners even though labor
services were exported. This inconsistency cannot be regarded as a
modified version of the Leontief paradox as suggested by Brecher
and Choudhri (1982), since it does not concern the relationship between
factor abundance in the United States and its pattern of commodity
and/or indirect factor trade. 3 But this inconsistency confirms
that Leontief’s data reveal the existence of some departures— 
international differences in technologies are a leading candidate— 
from the strict assumptions of the Heckscher-Ohlin-Vanek model. 


3 In other words, the model predicts that absorption per worker will be lower domes¬ 
tically than worldwide if labor services are exported, whether or not these exports are 
consistent with the assumed factor abundance. 
Sterling in Decline: The Devaluations of 1931, 1949 and 1967. By Alec
Cairncross and Barry Eichengreen.

Oxford: Basil Blackwell, 1983. Pp. viii + 261. $29.95.

This book brings together Eichengreen’s work on the devaluation of 1931
and Cairncross’s studies of the subsequent devaluations. The division of labor
in the monograph is evident: the 1931 episode is analyzed by means of an
econometric model, while the later devaluations are dealt with only in narrative
terms. The comparative analysis fills an introductory and a concluding
chapter. The economic history of the past half-century, which the authors
insist they are not attempting to summarize, is confined to a single chapter.
(This insistence is contradicted in part by items such as a list of dramatis
personae toward the end of the book, which gives it a more sweeping flavor.)
The descriptions of the devaluations are segregated into separate chapters,
with the 1931 devaluation taking as much space as the other two combined.

Some readers will already be familiar with Eichengreen’s work in this area
from his Princeton essay, Sterling and the Tariff, 1929-32 (1981). That monograph
was organized around the question, Why was the general tariff of 1932
imposed after Britain went off gold? Viewed as an employment-creating device,
the tariff was self-defeating if the exchange rate was floating freely. This
is, of course, a conclusion of most open-economy macro models, as the percentage
of the subsequent appreciation of the currency would be equal to the
effect of the tariff and thereby would just offset its effects in relative price
terms. Eichengreen shows convincingly, using quotations, that employment
concerns were indeed one of the reasons for going off gold, but crucially
important to the decision to impose the tariff was the fact that it would
strengthen sterling. Clearly this new interpretation casts all actors in this
drama in a more favorable light.

Similarly in the volume under review, Eichengreen poses a question
around which his extensive discussion revolves: Was the devaluation in 1931
caused by internal events, such as monetary expansion, or was it instigated by
the scramble for liquidity on the Continent? Readers will probably not be so
convinced by the response to this question as they were in the case of the
tariff. Here the analysis is based on econometric evidence, and this evidence is
very weak judged by conventional criteria. As a result the conclusions are not
very informative: “The simulation does not support the view that the 1931
financial crisis is explicable in terms of the fundamental determinants of
Britain’s balance of payments” (p. 82). In other words, the model, based
entirely on monetary factors, does not simulate well outside the sample pe- 
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riod. Therefore it does not provide evidence that monetary factors were 
central to the devaluation. 

There is also econometric evidence in the appendix to this chapter. Again it 
is not very convincing, being in some ways in contradiction to the monetary 
approach in the text. A better procedure here probably would have been to 
present more raw data, especially on interest rates and monetary aggregates. 
Then the reader could judge for himself, for example, whether domestic and 
foreign interest rates appear to be equal, thereby justifying the assumption in 
the model that the exchange rate was not expected to change. 

Eichengreen’s econometric work is consistent with the monetary approach 
to the balance of payments. But the thrust of the rest of the volume is that 
such an approach is hopelessly naive at best, and perhaps nearly tangential to 
the central mechanism lying behind a devaluation. In this regard, it is worth 
noting that the inflation of wages and prices, which such an approach pre¬ 
dicts, is not apparent after any of these devaluations. Rather than relative 
prices being a major concern, it is short-term capital that is volatile and has a 
major role to play. The reason, of course, is that devaluations do not happen 
unexpectedly and from positions of full equilibrium, as the models from the
Chicago school suggest. Instead, devaluations are resisted at all costs by gov¬ 
ernments. Indeed, there are no decisions to devalue, the summary chapter 
concludes, only earlier decisions not to devalue, which are subsequently over¬ 
taken by events. 

It is Cairncross’s fascination with these events that is striking about his 
portion of the volume. A devaluation is seen as just one more of these events
whose causes and consequences are not easily identified. “The war in Korea
began nine months after devaluation. ... By the time the storm died down
the devaluation of sterling was a distant, almost forgotten, event and its effects
hard to trace with certainty” (p. 151). Consistent with this is the view that
there is little to be learned from a comparative analysis: “It is scarcely possible 
to exaggerate the extent to which our three devaluations of sterling differed 
from one another ... it sometimes seems as if the only thing these devaluations
had in common was their coincidence on each occasion with an eclipse of
the moon” (p. 218).

If it is difficult to draw out any general lessons, perhaps there are two 
reasons for this. First, part of the reader's problem in assessing Cairncross’s 
portion of the volume is that small hints throughout indicate that he was an 
active participant in economic policy-making during the entire postwar pe¬ 
riod. Yet he is not listed among the dramatis personae, nor is his role in the 
devaluations spelled out fully. These absences, reflecting the author’s modesty,
are unnecessarily distracting.

An example of the uncomfortable position in which the reader at times 
finds himself can be found in Cairncross’s defense of “stop-go policies” preceding
the 1967 devaluation. Given the general repudiation of such policies
subsequently, these comments at first blush appear to be motivated by the role 
Cairncross might have played in formulating them. However, a more careful 
reading of the defense leads to a different interpretation. Namely, Cairncross
supported stop-go because the only apparent alternative was a go-go policy. It 
would be helpful if the reader were more fully informed of the role 
Cairncross played in providing an argument for restraint during this period. 

A second reason that it is difficult to draw any lessons from the book is the
generally low-key, anecdotal nature of the discussion in the latter chapters.
They argue for the relevance of policies in general, with expenditure policy
being the most important, in a very subtle fashion, but cumulatively in a
forceful way. One waits until the final chapter to see this point [...].
In that chapter the role of fiscal policy is highlighted as the [...]
common element to the three devaluations. In each case, it was [...]
of expenditure [...] that governments found hard to swallow.

Although most ministers did not make the connection between fiscal policy
and the exchange rate, the authors feel that the market certainly did. [...]
to demonstrate the government’s resoluteness and [...] the authority’s
commitment [...].

In essence, the text starts out with a monetary approach to devaluation but
[...] in favor of an expenditure view in later portions. [...] and has
recently been revived in [...] “Minnesota-style thinking on the budget, [...]”
(Dornbusch [...]). The profession has been quick to employ such thinking in
assessing the contribution the very expansionary Reagan budget makes to the
recent overvaluation of the U.S. dollar. Curiously, in the present view, loose
fiscal policy appreciates the currency at least in the short run.

In terms of the now-popular nomenclature, Cairncross and Eichengreen
see fiscal concerns as playing the active policy role; as a result, monetary
policy has a much diminished part in any financial regime. In coming to this
conclusion, this volume is of more immediate relevance than one expects in a
monograph in economic history.


University of Western Ontario


Russell S. Boyer
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Surely not IBM, as this allusion to punched cards seems to imply. Thirteen 
years after filing of the case, and following 6 years of uncompleted trial, IBM 
escaped scot-free—not because a court ruled in its favor, but because William 
Baxter, the incoming antitrust chief, decided over the objections of the trial 
staff and the judge that the government’s position was “without merit.” He 
did not even attempt to obtain procompetitive concessions from IBM in return
for this capitulation, as he did in the AT&T case. 2 Legally, IBM’s
victory was final and complete. If anything was "folded, spindled, and 
mutilated" it was justice or, more precisely, the administration of the antitrust 
laws. 

The peculiar way in which IBM won the case is relevant not only to the title 
but also to the intent of this book. Contrary to what Carl Kaysen maintains in 
his foreword, the book (referred to as Folded hereafter) hardly qualifies as an
industry study, nor do the authors present it as such. To see how an
econometrician as distinguished as Franklin Fisher would approach an industry
study we need only look at his excellent work on the petroleum industry
(Fisher 1964). Little or no econometrics can be found in Folded. 3 Its true
purpose is stated fairly explicitly on pages 1-2: it is to provide ex post economic
justification for Baxter’s dismissal of the government case. 4

Having served as the principal economic witness for IBM during the trial, 
Fisher did not need much preparation for the present task. Much of Folded is 
taken up by a restatement of IBM’s answers to the government’s complaints 
and of IBM’s own economic arguments. Even though the lawsuit is no longer 
alive, Fisher and his coauthors remain in an adversary mood; they continue to 
treat IBM’s position with uncritical respect and the government’s position 
with a mixture of exasperation and contempt. Particularly noticeable is the 
steady drumfire directed at Alan McAdams, Fisher’s counterpart on the government
side. McAdams, who was cross-examined by IBM’s attorneys for
some 3 months running, is cited innumerable times, hardly ever favorably.
The uninformed reader is led to infer that McAdams was the evil spirit
behind the government’s assault on its blameless victim, yet he was not even
involved in the filing of the case. McAdams’s real sin, in IBM's eyes, was
probably that his management skills were decisive in advancing the case to the 
trial stage, thereby frustrating IBM’s delaying tactics (about which more 
later). 


further investigation. U.S. v. IBM was filed in January 1969, on the last working day of
the Johnson administration. In the fall of 1971 Richard MacLaren, then head of the
Antitrust Division, asked me to help draft a proposal for settlement of the case. MacLaren
left the Justice Department shortly afterward. His acting successor did not present
the proposal to IBM, as had been intended, but adopted it instead as the division's
tentative position on relief. My status in the division changed accordingly from consultant
to expert witness. I also continued to work toward settlement but ultimately without
success. In the summer of 1977 an internal disagreement as to the timing of my
court testimony led to my withdrawal from the case.

2 The government was not about to lose the IBM suit; on the contrary, by its efforts
to have Judge Edelstein removed from the case IBM indicated a dim assessment of its
prospects, at least in the trial court. IBM had earlier been prepared for substantial
concessions toward settlement.

3 Those looking for a more conventional industry study will note the absence of data
on the market shares and profits of IBM's competitors and the neglect of academic
writings on the industry.

4 The “Stipulation of Dismissal,” reprinted on pp. 368-69 of Folded, gave no reasons
for Baxter’s conclusion that the case was without merit.
Apart from this lack of balance, Folded is clearly written, closely reasoned,
and copiously footnoted. It starts out by deploring the obstacles posed by
Learned Hand’s judgment in U.S. v. Alcoa (1945) to the expansion of
successful firms [...] making market share rather than performance the central
focus of monopoly [...]. This criticism has merit in itself, but is
irrelevant to the government’s original complaint against IBM. Judge Hand
did not make economic policy; he merely interpreted the intent of
Congress, which, in his oft-quoted words, “did not condemn ‘bad trusts’ and
condone ‘good’ ones.” In the quarter of a century between Alcoa and the filing
of U.S. v. IBM, Congress and the Supreme Court passed up numerous
opportunities to modify the Alcoa doctrine, which became the law of the land.
Until the [...] the Antitrust Division considered itself obliged to enforce
it by bringing apparent violators to trial. That is precisely what the
division did in the IBM case. As Folded demonstrates, IBM had its defenses,
whose merits would no doubt have been determined by the Supreme Court if
the case had run its normal course.

One of the consequences of the Alcoa doctrine, and especially of Judge
Hand’s unduly precise opinions as to what market share constitutes a monopoly,
is a battle royal over market definition. The defendant naturally goes
for a broad definition, the plaintiff for a narrow one. In the IBM case both
sides stretched their definition beyond the limits of credibility. Fisher came up
with a mythical “data processing industry,” which included not only all sorts
of hardware, software, and services but also, in the case of the leasing companies,
some of IBM's own customers. In that market IBM's share was alleged
to be about 30 percent, though it is significant that Folded does not tell us
the shares of the big players (if any) in the other 70 percent.
The government, with Judge Hand’s 67 percent criterion firmly in mind,
proposed a definition designed to exceed that figure by excluding certain
rather obvious competitors. Needless to say, the witnesses produced by the
government to justify that definition were no more persuasive than their
counterparts on the IBM side. IBM's power to set prices and exclude competitors,
the key issue in the lawsuit, was not greatly affected by these competing
claims; on either definition IBM was an order of magnitude larger than the
next largest firm. Folded is right in concluding that the excessive emphasis on
market share in monopoly cases has diverted attention from more important
questions. 5

Without access to the trial record it is not possible to review most of the 
other issues on which IBM's position is presented in the book. I shall confine
myself to three: predatory pricing, intent to monopolize, and the relation 
between lawyers and economists. 

Predatory pricing was one of the conduct issues in the case. The govern¬ 
ment alleged that in the 1960s IBM used “fighting machines," computer 
systems that had been developed, announced, and priced not so much to 
satisfy customers’ needs as to discourage certain competitors from staying in


5 A few of these questions were addressed by lower courts in the private antitrust
suits brought against IBM, most of which deal with peripheral issues (in both senses of 
the word). Folded, incidentally, exaggerates IBM’s success in these cases by neglecting to 
mention that the most important one, with Control Data as the plaintiff, was settled in 
1973 at a cost to IBM estimated at over $100 million. 
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or entering the market. According to this allegation the company did not 
expect to earn anything like its normal return on the necessary investment. In
defense Folded essentially follows Areeda and Turner (1975) in identifying 
predatory pricing with pricing below short-term marginal cost. This 
definition, which ignores the investment aspect, is inconsistent with contemporary
theories of the firm, and in policy terms it provides too much latitude
for the discriminatory or exclusionary exercise of market power.

To decide whether pricing is predatory two criteria should be applied: (1) 
Assuming the initial gross profit margin to be maintained during the life of 
the product, is the expected return on the relevant investment substantially 
below the return on alternative investments? (2) If so, is the prospect of
raising the profit margin enough to earn a normal return contingent on the
exit of existing competitors or the nonentry of potential ones? If the answer
to both questions is yes, predatory intent should be presumed. Other evi¬ 
dence of such intent, such as internal memoranda on corporate strategy, may 
reinforce this presumption. 

Folded (p. 347) dismisses the whole matter of intent in a few lines, and
thereby hangs a tale. In the early stages of the suit the government made the 
usual demand for documents. As such initial demands often are, it was 
broadly formulated. Unlike most defendants, IBM chose to take the govern¬ 
ment at its word and dispatch fleets of trucks carrying millions of documents. 
The company no doubt knew the Antitrust Division's well-deserved reputa¬ 
tion for administrative incompetence and did not foresee the emergence of 
the efficient McAdams. In its eagerness to drown the division in paper, IBM 
threw in a number of documents that left little doubt of its predatory intent. 
In one of these, for instance, a corporate strategist named Hilary Haw advised 
top management how to reduce its competitors to “dying companies.”6 Once
the oversight was discovered the company's lawyers tried desperately to re¬
cover the documents in question. This “inadvertent waiver” case went all the
way to the Supreme Court, but IBM did not prevail, although it did manage
to postpone the trial by many months in the process.7 After this defeat IBM,
and Folded in its wake, argued that evidence of intent is irrelevant. 

The concluding discussion of the role of economics in antitrust policy en¬
ables the authors to recapitulate their rebuttal of the government's complaint 
in nine points. A few of their objections are valid, but once more selectivity 
undermines their argument. One central issue they ignore is IBM's attempt to 
dominate the industry by imposing its technical standards, especially for in¬ 
terconnection. Although U.S. v. IBM is now history, this issue is still of im¬
portance. The company’s strategy in the microcomputer market has disturb¬
ing parallels with its earlier approach to the market for large systems: deferral
of entry until the pioneers have made their mistakes, then the introduction of
a product line that IBM and its nonaffiliated admirers confidently assert will
set new industry standards. 

The discussion also puts forward the view (p. 348, n. 9.3) that economists 
are related to lawyers as servants are to their masters. Actually the relation is 
different: as expert witnesses, economists are under oath, while lawyers nor¬ 
mally are not. This does not mean that lawyers are free to lie, but the trial 
attorney who tells the whole truth about his or her client is likely to end up in 


6 I am quoting from memory.

7 Folded, true to form, blames the delay on the government.

the poorhouse. Expert witnesses are held to higher standards; they are ex¬
pected to remain reasonably objective, even when they write books.

To sum up, Folded succeeds in demonstrating that IBM had a case and that
the government case had weaknesses. It fails in its main purpose, which was to
show that the government case was “without merit,” the reason Baxter gave
for abandoning the suit. If the merits of the complaint had been decided by
the courts, the present status of the Alcoa doctrine, with which the book’s
authors are justifiably uncomfortable, would also have been clarified. As it is,
antitrust policy, particularly in the area of monopoly, has become a large
question mark. This uncertainty should concern those who believe, as I do,
that enforcement of the antitrust laws is necessary for the continuing health 
of our economy. 

Hendrik S. Houthakker

Harvard University

Labor Supply. By Mark R. Killingsworth. Cambridge Surveys of Economic
Literature.

New York: Cambridge University Press, 1983. Pp. 493. $49.50 (cloth); $17.95
(paper).

Mark Killingsworth’s book on labor supply is clearly written and provides a
valuable summary of labor supply research. It is truly a review book in that it
does not attempt to introduce new material, extend research frontiers, or
indicate fruitful new research directions. The last four pages of the book, a
section entitled “Where Do We Go from Here?” is in fact a partial summary of
some of the main research outcomes, and most of the points were made
elsewhere in the volume. The very final paragraph of the book conjectures
that models incorporating risk and uncertainty and lifetime decisions are
likely to be the concern of much future work, but we are not given any reason
to expect a valuable payoff to particular research outcomes or innovations.

Lest the claim that Killingsworth has not extended our research horizons be
taken as fatal to the value of his effort, it should be noted that the stated
purpose of the Cambridge Surveys is to give “a clear structure and balanced
overview of the topic written at a level intelligible to the senior under¬
graduate.” Fortunately for the rest of us (and maybe even the under¬
graduates), the book is much better reading for professors of economics. The
task of summarizing the literature on labor supply is monumental, since the
number of papers appearing on the subject has been in a state of rapid
expansion until very recently. In a study of research in labor economics
(Stafford 1985), I have discovered that the number of full-length papers on
the topic of labor supply in selected major journals rose from seven during
1965-69 to 38 in 1970-74, 53 in 1975-79, and thereafter declined somewhat
to 38 in 1980-83.1 Perhaps this most recent slowdown in new studies allowed
Killingsworth to complete this book!

The point is that the study of labor supply, defined to include such topics
as labor supply of men, labor supply of women, family labor supply, acquisition
of schooling and on-the-job training (in endogenous wage models), retirement,
and labor supply incentive effects of tax and transfer schemes,
underwent a dramatic growth during the 1970s. If we broaden the definition
to include endogenous wage life-cycle models, the figures on number of
articles for the above-mentioned time periods are 19, 65, 95, and 61, respec¬
tively. The book is based on a far more extensive definition of the literature,
including papers in “minor” journals, specialty journals, books, monographs,
and unpublished papers. In this light, to say that Killingsworth has done an
excellent job of synthesizing and summarizing what is known in so vast a
literature is a strong endorsement of his effort. Further, nonspecialists in
other areas will benefit from reading the book to find out what has taken
place, and specialists in the area will gain some perspective on where individ¬
ual efforts fit.

The book has three major parts. The first major part develops the basic
static labor supply model and extensions thereof, including family labor sup¬
ply, time allocation, and brief mention of disequilibrium. Exposition of the
theoretical/econometric model requires 21 equations, and a total of 33 empir¬ 
ical studies are found to merit review. The main features of the studies are 
summarized in five tables, and these include sample characteristics, variable 
definitions, and labor supply elasticities. The second major part of the book 
develops an assessment of the “second-generation" studies of labor supply. 
The main feature of second-generation studies is a major expansion in atten¬
tion given to econometric and data-based issues, particularly sample selection 
problems and distinctions between participation decisions and hours condi¬ 
tional on participation. Exposition of the theoretical/econometric model re¬ 
quires 57 equations, and a total of 27 empirical studies are found to merit 
review. The third section of the book reviews the development of intertem¬
poral and dynamic labor supply models, including some models of risk and 
uncertainty. These models are more palatable in the sense that they are more 
complete representations of labor supply choices, but manipulation of the 
basic framework to draw out inferences requires 147 equations, and, in con¬
trast, a meager total of 17 empirical studies are found to merit review.

Despite Killingsworth’s claim that modeling dynamic labor supply is in its
youth and will become a socially valuable adult, it seems to me that another
inference can be drawn across his three main topics. For the study of labor
supply either one can choose simplistic models and find lots of data that 
appear to match up with these models or one can construct more interesting 
(or, at any rate, more comprehensive) models but give up serious hope of 
implementing parametric empirical tests. If this applies to labor supply and 
perhaps to other topics in economics, then the book has a serious message: 
economics as an empirical science has to rest on very simple models; as mod¬ 
els become more comprehensive they can provide insight into the behavior of 
individuals and markets, but they have little hope of empirical implementation.
Theorists will be separated from empiricists unless the theoretical structure
is concise and manages to [...] elements of the problem.

1 The selected major journals were American Economic Review, Econometrica, Journal of
Political Economy, Quarterly Journal of Economics, Review of Economics and Statistics, and
International Economic Review.

What [...] to our knowledge of economic behavior? Killingsworth points out
that for his third topic the connection between the model and evidence is at
the level of stylized facts rather
than parametric results. One of the few efforts to fit life-cycle endogenous
wage models was attempted by Heckman (1976), and this study is really more
in a simulation mode since it could be effected only by use of quite restrictive
assumptions and the use of synthetic cohort data (Killingsworth, pp. 327-29).
Most of what we "know" applies to the static model and extensions thereof as 
analyzed by the methods of the first- and second-generation studies. Here we 
know that the labor supply response of women to wage and income changes is 
greater than that of men, and from the second-generation studies it appears 
that substitution and income elasticities are greater for women than for men, 
that leisure is a normal good for both men and women, and that income-
compensated labor supply elasticities are positive. However, there is a sub¬ 
stantial dispersion in results, and the reasons for the differences across stud¬ 
ies are not easy to determine. Killingsworth does not tell us which studies are 
most likely to be right, though he does indicate many of the limitations of the 
research.

A strength of the book is that it does set out, in reasonably consistent 
notation, sets of models. For example, the class of dynamic models with 
exogenous wage rates is set out and compared in consistent notation, and 
derived implications of the models are compared. A danger in this is that the 
reader may fail to obtain a depth of understanding of these types of models in 
terms of the how and why aspects of derivation. Reading only this volume 
would not permit one to commence new research on these areas, and instead 
one would have to track down the original references and review them carefully.
This is also partly because, in some cases, the book fails to represent
some key features of the models. 

As an illustration, consider his discussion of cycling, which is the case where
a person has a realized lifetime plan involving intermittent labor market 
participation and erosion of market skills with subsequent reconstruction of 
(those) skills. It is suggested at several points in the book that in dynamic labor 
supply models cycling is probably an interesting and meaningful feature. 
However, the dynamic labor supply models may suggest cycling as a mathe¬ 
matical anomaly, and quite commonly there are multiple sets of necessary 
conditions for an optimum; some of what appear to be interesting paths are
simply local or global minima or local maxima rather than a global maximum. 
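
The gap between necessary and sufficient conditions can be made concrete with a toy example. The sketch below is not drawn from any of the models reviewed: the quartic `utility` is a hypothetical stand-in, chosen only because its first-order condition has three solutions. All three satisfy the necessary condition, yet one is a local minimum and only one of the two local maxima is the global maximum.

```python
from fractions import Fraction as F

def utility(x):
    """Hypothetical objective (an assumption for illustration only)."""
    x = F(x)
    return -x**4 / 4 + 2 * x**3 / 3 + x**2 / 2 - 2 * x

def slope(x):
    """First derivative of utility, written in factored form."""
    return -(x + 1) * (x - 1) * (x - 2)

def curvature(x):
    """Second derivative of utility."""
    return -3 * x**2 + 4 * x + 1

# Every candidate satisfies the necessary (first-order) condition.
candidates = [-1, 1, 2]
assert all(slope(x) == 0 for x in candidates)

# The second-order check separates local maxima from local minima.
classified = {
    x: "local max" if curvature(x) < 0 else "local min" for x in candidates
}

# utility falls without bound in both directions, so the global maximum
# must be the best of the stationary points.
global_max = max(candidates, key=utility)

for x in candidates:
    print(x, classified[x], utility(x))
print("global maximum at x =", global_max)
```

In this toy case a path selected from the first-order condition alone could sit at `x = 1` or `x = 2`, neither of which maximizes the objective, which is the sense in which an apparently interesting path may be a mathematical artifact.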

In a world of certainty it makes little sense for one to devote a great deal of 
effort to the acquisition of market skills only to drop out for retraining when
skills are at a peak. Unfortunately, the models are often so messy that it is not
easy to check for sufficiency. In work by Driffill (1980), it is shown that when 
life-cycle plans involve retirement as part of optimal labor supply, then the 
trajectory characterized by cycling is dominated (in utility terms) by the path 
not involving cycling. In work by Ryder, Stafford, and Stephan (1976, pp. 
663-64), near cycling—wherein training time initially falls, then rises, and 
falls once again—is dominated by other paths. Moreover, radically different 
life-styles (one involving little training and work and perhaps early retire¬ 
ment, the other involving great amounts of training and market work and no 
retirement) can be of equal lifetime utility value. This latter finding suggests 
that lifetime labor supply may entail a choice that is dramatically influenced 
by small changes in initial conditions, just as small changes in fixed costs can
sharply alter labor supply in the context of the static model (see Killingsworth,
pp. 24-25).

A fourth section of the book reviews the implications of labor supply re¬ 
search for tax and transfer programs. This section is an attempt to integrate 
public finance topics with the labor supply literature. It is a good section, and 
it provides a valuable introduction to the general subject. The effects of
progressive taxes and tax and benefit aspects of social security are covered.
The results of the negative income tax experiments are summarized briefly. 

Two important topics in this literature are discussed either insufficiently or 
not at all. An important result of the research on the effects of progressive 
taxes is that even if the overall wage elasticity of labor supply is small,
the effects of progressive taxes on reducing labor supply and adding to dead¬
weight burden can be very substantial, as indicated in work by Hausman
(1981) and Blomquist (1983). This important point is partly obscured since it
is intertwined with remarks on the Laffer curve (p. 359). Another dimension
to progressive taxes is their social insurance properties as discussed by Varian
(1980). Social insurance can provide incentives for expected value- 
maximizing choice of occupation, and this is not discussed in the fourth 
section either. 

An important characteristic of the volume is that it concentrates on the set 
of models used in the literature and indicates the limitations of the research 
within the confines of these models. A danger is that the overall research 
effort is substantially weakened by failure to select the right targets. For 
example, the fact that much of the work may really be identifying some form 
of supply and demand locus is mentioned only briefly; the role of a market 
wage-hours locus confounding the supply elasticity estimates, particularly for 
women, is given rather little attention. Stability of the leisure demand func¬ 
tions or household technology is given minor weight. Yet in both Japan and 
the United States there appears to have been a noticeable reduction in market 
work of adults during the period when color television and more abundant 
programming occurred (1965-80), and this trend appears to have slowed 
recently with a concomitant leveling off in television viewing time. 

The “dog that didn't bark" criticism may be less fairly directed at Kil- 
lingsworth's book, which is aimed at telling us what is out in the literature 
rather than indicating its silent points or where it could more profitably be 
directed. In short, the book is more encyclopedic than interpretive. 

Frank P. Stafford 

University of Michigan

A model of patenting behavior is presented in which an innovating
firm possesses private information about profits available to competi¬ 
tors, and patent coverage may not exclude profitable imitation. A 
wide variety of predictions are obtained. Among them: a firm will 
patent only some fraction of its produced innovations, and this frac¬ 
tion is positively correlated with endogenous research and develop¬ 
ment expenditures. Numerous extensions of the model are devel¬
oped, for example, the inclusion of an explicit patent race and the 
emergence of trade secrecy. 


In the empirical literature on technological change, it has long been
asserted that while there is a strong positive correlation between
expenditures on research and development (R & D) and the number of
patent applications, patent counts are a flawed measure of innovative
activity; firms simply do not patent all innovations. Early studies by
Scherer (1965, 1967), moreover, indicate significant variation across
industries in what Scherer calls the “propensity to patent” (the
proportion of innovations actually patented). Later studies by Pakes and
Griliches (1980) and Scherer (1983) find similar interindustry variation
in patenting activity that cannot be explained by differences in
the level of R & D expenditures. Pakes and Griliches also find a
degree of randomness in the patenting activity within industries that
again cannot be accounted for by R & D variations. The residual
patenting behavior is usually explained with reference to either
differences in the information revealed by patents, so that firms employ
trade secrecy instead (see Kahn 1962; Machlup 1962), or variations in
the capability of the patent holder to appropriate rents (see Scherer et
al. 1959; Scherer 1965).

A telling aspect of the theoretical work in this area is that while
much of it would predict a positive correlation between patents and R
& D expenditures, simply because more inputs imply greater output
(see, e.g., Reinganum 1983), even very recent work is inconsistent
with the basic facts that some innovations go unpatented and that
there is an imperfect correlation between patents and innovative
activity.1 The standard model involves perfect information and total
patent coverage. Firms therefore optimally patent all innovations, 
and patents become an exact measure of innovative activity. 

This paper advances the state of the theory by considering pat¬
enting activity by a firm operating in an environment where the pat¬ 
ent system can both transmit private information from the innovator 
to competitors and be of sufficiently limited coverage that competitors 
could, with some probability, earn positive profits through an imita¬ 
tion of the patented good or process. The central result is that the 
model generates an equilibrium propensity to patent between zero
and one. From this proposition numerous testable predictions arise,
in particular, a positive but imperfect correlation between R & D
expenditures and the propensity to patent. 

In general terms the model can be thought of as a three-stage 
game. In the first stage, two identical firms engage in a “race" for 
some innovation. One firm is victorious and, as a by-product of its 
successful research activity, deduces some private information about 
the innovation. This information is assumed to concern the profits a 

1 Tandon (1982) is a good example. There, in spite of mandatory licensing of all
patented goods, it is assumed that the firm continues to patent all innovations.
Reinganum (1983) discusses the rate of firm patenting activity, but this refers to the rate at
which firms develop innovations; the innovations themselves are always patented.

competitor would earn should he choose to duplicate the innovation
or engage in a related but not identical activity. The patent race 
induces a natural informational asymmetry at the end of the first 
stage, which the innovator can attempt to exploit in the next. 

While a highly stylized version of the first-stage game is presented 
in Section II, the main focus of the paper is the second-stage “pat¬
enting game.” Having been victorious in the first stage, the innovating 
firm must determine whether the good or process will be patented 
(where randomization in the patent decision is possible). A patent 
prevents the competitor from producing an identical product or us¬ 
ing an identical process, but leaves open the option of his doing noth¬ 
ing, producing a differentiated product, or using a related or the 
original process. The competitor chooses among these options while 
uncertain of their profitability, but having observed the patenting 
outcome and drawn any possible inferences concerning the in¬ 
novator’s private information. In the final stage a standard duopoly 
or monopoly output game is played. 

The details of the simplest model are presented along with its equi¬ 
librium in Section I. It is shown that even if the patent process conveys 
no explicit information (i.e., information that might itself increase 
profits for the competitor), as long as the patent coverage is not so 
extensive as to rule out with certainty the possibility of a profitable 
competing product, the innovator will adopt a strategy that involves a 
unique propensity to patent that is positive but less than one. This 
result follows because patenting activity serves to reveal implicitly 
some of the innovator’s private information to the competitor; a pro¬ 
pensity to patent equal to one or zero implies that the occurrence of a
patent, or lack thereof, can convey no such information. 

Section II develops a number of predictions regarding the equilib¬ 
rium propensity to patent and pursues several enrichments of the 
simple model. One interesting prediction is that the propensity to 
patent will be lower the more profitable (ex ante) a competing product 
is expected to be. This somewhat counterintuitive proposition arises 
because the patent must become a more reliable signal of the 
unprofitability of the imitation, and this is accomplished by a reduc¬ 
tion in the propensity to patent. As regards the elaborations, explicit 
treatments of product and process innovations are given. Further, it is 
shown that if the innovator can engage in activities that make for 
lower profits for the competitor, such activities will diminish when the 
economic environment changes in the direction of making imitation 
more profitable for the competitor. Similarly, if patenting directly 
reveals information that raises profits for the competitor, the equilib¬ 
rium propensity to patent is reduced. This seems to be what is meant 
by “trade secrecy.” Next, when the patent race is introduced (i.e.,
stage one) the positive correlation between the level of research ex¬
penditures and patenting activity emerges as an equilibrium phenom¬
enon. Finally, forcing innovators to reveal their private information
would not necessarily be Pareto improving even if the rate of innova¬ 
tive activity did not respond to the possibility of such coercion. 

Section III considers relaxing some of the model’s assumptions and 
provides a discussion of the equilibrium concept utilized in the model. 

The paper closes with a brief summary. A formal proof of the main 
proposition is contained in the Appendix. 

I. A Model of Strategic Patenting Behavior 

The model presented in this paper is a three-stage game involving an 
innovation race, a patent game, and distribution of product(s). This 
section suppresses the first and last stages in order to focus on the 
patent game. Thus it is supposed that the innovation race has oc¬ 
curred, and one player has succeeded in generating a viable innova¬ 
tion; the innovator will be referred to as $\mathcal{I}$, and the other (who can be
regarded as the loser in the innovation race) as the competitor, $\mathcal{C}$. A
simple model of $\mathcal{I}$'s patenting behavior and $\mathcal{C}$'s response is
developed.

The decision $\mathcal{I}$ must make is whether to patent the innovation. The
patent he obtains is assumed to have the following properties: (i) It is
available to $\mathcal{I}$ at zero direct cost. (ii) It has limited coverage in the
sense that while a good or process being patented completely precludes
$\mathcal{C}$'s engaging in the same activity as $\mathcal{I}$ (or one sufficiently
similar to fall within the proscription specified by the patent, that is, the
patent’s “coverage”), there are a wide range of activities that are not ruled out
and that compete to some extent with $\mathcal{I}$'s innovation. Thus $\mathcal{I}$ cannot
have a patent on all conceivable competing goods or processes. (iii) It
does not directly reveal any information that alters the profitability of
the various actions $\mathcal{C}$ might take.2

If patents are to be viewed as information transfer mechanisms,
information must be distributed asymmetrically. Accordingly, it is
assumed that as a by-product of innovation $\mathcal{I}$ gains access to private
information on the actual profitability of the various options open to
$\mathcal{C}$. Although aware of the existence of the innovation, $\mathcal{C}$ knows only
the distribution of profits associated with any action he might take.
For example, under the process innovation interpretation $\mathcal{I}$ could be


2 It may be that the patenting process requires sufficiently detailed information to be
made public that production of a competing product by $\mathcal{C}$ is rendered more profitable
if $\mathcal{I}$ acquires a patent. However, this possibility is excluded for the time being, and it is
shown in Sec. III that reasonable relaxation of this restriction leaves the results of the
model qualitatively undisturbed.

aware of the additional costs $\mathcal{C}$ would have to incur to develop an
alternative process for producing the good in question, while $\mathcal{C}$ knows
merely the distribution of such costs.

More specifically, the options open to $\mathcal{C}$ are as follows. He may
engage in one of three activities (assumed distinct): nonparticipation
($n$); duplication ($d$), his preferred choice among the activities ruled
out by a patent; and imitation ($i$), his preferred choice among activi¬
ties not ruled out by the patent. Choosing $n$ yields $\mathcal{C}$ a sure payoff
normalized to zero. For simplicity, $\mathcal{C}$'s other activities are assumed to
generate either high or low profits: for $i$, profits are $\pi^i = \bar{\pi}^i$ or
$\underline{\pi}^i$, where $\bar{\pi}^i > \underline{\pi}^i$; similarly $d$ yields
$\pi^d = \bar{\pi}^d$ or $\underline{\pi}^d$, where $\bar{\pi}^d > \underline{\pi}^d$.
It is assumed that3

$$\bar{\pi}^i,\ \bar{\pi}^d > 0 \tag{1}$$

and

$$\underline{\pi}^i,\ \underline{\pi}^d < 0; \tag{2}$$

that is, it is possible for either $i$ or $d$ to be profitable or not.

Innovator $\mathcal{I}$'s endowed private information comprises knowledge
of the actual value taken on by the pair $(\pi^i, \pi^d)$; $\mathcal{C}$'s endowed informa¬
tion is simply the ex ante probability distribution of $(\pi^i, \pi^d)$:4

$$\begin{aligned}
\Pr[(\pi^i, \pi^d) = (\bar{\pi}^i, \bar{\pi}^d)] &\equiv \alpha^h, \\
\Pr[(\pi^i, \pi^d) = (\bar{\pi}^i, \underline{\pi}^d)] &\equiv \alpha^i, \\
\Pr[(\pi^i, \pi^d) = (\underline{\pi}^i, \bar{\pi}^d)] &\equiv \alpha^d,
\end{aligned}$$

and

$$\Pr[(\pi^i, \pi^d) = (\underline{\pi}^i, \underline{\pi}^d)] \equiv \alpha^l = 0.$$

Restricting the stochastic structure so that at least one activity is always
profitable merely eliminates a good deal of algebra and is easily re¬
laxed (see Sec. III). The expectations operator under the ex ante
probability distribution will be denoted $E(\cdot)$. For example,
$E(\pi^i) = (\alpha^h + \alpha^i)\bar{\pi}^i + \alpha^d\underline{\pi}^i$.

The analysis requires some restrictions on the payoffs to the differ¬
ent activities. Since $\mathcal{C}$ and $\mathcal{I}$ are assumed risk neutral, these restric¬
tions may be stated in terms of expected profit.

Focusing on $\mathcal{C}$'s profits, the first restriction is that $i$ is a viable
activity. That is, in the absence of any new information, $i$ is expected to be
profitable:

$$E(\pi^i) > 0. \tag{3}$$

3 The profit levels for both agents are determined in the final stage of the game. All
assumptions on these profit levels are consistent with explicit treatment of that stage.

4 The mnemonics are h = “high profit,” i = “i is high profit,” d = “d is high profit,”
and l = “low profit.”

A second condition serves the same role for $d$:

$$E[\pi^d \mid (\pi^i, \pi^d) \neq (\bar{\pi}^i, \bar{\pi}^d)] \geq 0. \tag{4}$$

Equation (4) states that $d$ is expected to yield nonnegative profit even
if it is known that the high profit state for both activities has not
occurred, and so is obviously stronger than $E(\pi^d) > 0$. This additional
strength is required to rule out an equilibrium trivially distinct from
the one analyzed below. For more detail on this point, see Section III.

Finally, it is assumed that

$$E(\pi^i \mid \pi^i = \bar{\pi}^i) > E(\pi^d \mid \pi^i = \bar{\pi}^i). \tag{5}$$

Since the left-hand side is equal to $\bar{\pi}^i$, (5) simply states that $\bar{\pi}^d$ is not so
large (or $\underline{\pi}^d$ so close to zero) that $d$ dominates $i$ even if it is known that
$\pi^i = \bar{\pi}^i$.5 If (5) fails, in the absence of new information on $\pi^d$, $i$ is not a
viable alternative to $d$ whenever $d$ is possible (i.e., not precluded by a
patent).
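
As a rough numerical check on how restrictions (1)-(5) fit together, the sketch below evaluates each of them for one set of made-up primitives. The prior weights and profit levels are assumptions chosen purely for illustration and do not come from the paper; exact rational arithmetic is used so that the inequalities are not blurred by rounding.

```python
from fractions import Fraction as F

# Hypothetical prior over the states h, i, d (the low-low state has
# probability zero, as in the text). These numbers are assumptions.
alpha = {"h": F(1, 4), "i": F(1, 2), "d": F(1, 4)}

# Hypothetical high and low profit levels for imitation (i) and duplication (d).
pi_i = {"high": F(10), "low": F(-8)}
pi_d = {"high": F(6), "low": F(-1)}

# Realized (pi^i, pi^d) in each state, following the mnemonics of the text.
state_profits = {
    "h": (pi_i["high"], pi_d["high"]),
    "i": (pi_i["high"], pi_d["low"]),
    "d": (pi_i["low"], pi_d["high"]),
}

def expect(activity, states=("h", "i", "d")):
    """Expected profit of an activity (0 = i, 1 = d), conditional on `states`."""
    mass = sum(alpha[s] for s in states)
    return sum(alpha[s] * state_profits[s][activity] for s in states) / mass

E_pi_i = expect(0)                         # left-hand side of (3)
E_pi_d_not_h = expect(1, ("i", "d"))       # left-hand side of (4)
E_pi_i_given_high = expect(0, ("h", "i"))  # left-hand side of (5)
E_pi_d_given_high = expect(1, ("h", "i"))  # right-hand side of (5)

checks = {
    "(1)": pi_i["high"] > 0 and pi_d["high"] > 0,
    "(2)": pi_i["low"] < 0 and pi_d["low"] < 0,
    "(3)": E_pi_i > 0,
    "(4)": E_pi_d_not_h >= 0,
    "(5)": E_pi_i_given_high > E_pi_d_given_high,
}
print(checks)
```

With these particular numbers all five restrictions hold, and the left-hand side of (5) equals the high value of $\pi^i$, as the text notes it must.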

Given this framework, $\mathcal{I}$ must decide first whether to produce the
innovation and then whether to patent the innovation if it is pro¬
duced. Assuming the decision to produce has been made, consider $\mathcal{I}$'s
patenting problem. Since $\mathcal{I}$ may want to randomize between pat¬
enting and not patenting, his patent decision is best summarized by a
triple of probabilities $\Delta = (\delta^h, \delta^i, \delta^d)$, where $\delta^h$, for example, is the
probability that a patent is obtained given that $(\pi^i, \pi^d) = (\bar{\pi}^i, \bar{\pi}^d)$.6 The
triple $\Delta$ will be referred to as $\mathcal{I}$'s patenting policy. As in all games of this
sort, $\mathcal{I}$ is assumed to decide on a randomization pattern for all three
possible states. In any period, obviously, he will use only one element
of $\Delta$; that is, once $\mathcal{I}$ has learned the realization of $(\pi^i, \pi^d)$, he will
decide whether or not to patent given the particular realization.7

Because $\mathcal{C}$'s action determines the “market structure,” the profit $\mathcal{I}$
receives depends only on $\mathcal{C}$'s chosen action, and not directly on
whether a patent occurs. If $\mathcal{C}$ chooses $n$, $\mathcal{I}$ receives monopoly profit $P^m$,
while a choice of $i$ or $d$ by $\mathcal{C}$ yields $\mathcal{I}$ $P^i$ or $P^d$, respectively. It is
assumed that

$$P^m > P^i > P^d > 0. \tag{6}$$


5 A condition sufficient for (5) is $\bar{\pi}^i > \bar{\pi}^d$, the product differentiation argument for
which is compelling.

6 The mnemonics for $\Delta$ are the same as those in n. 4.

7 In the equilibrium analyzed below, it is supposed that $\mathcal{I}$ chooses his patenting policy
prior to obtaining the information on the realization of $(\pi^i, \pi^d)$. Because $\mathcal{I}$'s profits,
given any action by $\mathcal{C}$, do not depend on $(\pi^i, \pi^d)$, it is straightforward to show that this
assumption does not alter the results; the rule $\mathcal{I}$ would pick knowing the realization of
$(\pi^i, \pi^d)$ is the same as that presented.

The requirement $P^m > \max\{P^i, P^d\}$ is not controversial. Standard
product differentiation arguments imply $P^i > P^d$ in the product inno¬
vation case (although not for all possible demand specifications). In
the process innovation interpretation, $P^i > P^d$ is also to be expected.
There, although $\mathcal{I}$ and $\mathcal{C}$ produce the same good, the innovation is
presumably cost reducing. It follows that so long as the innovation
also reduces marginal cost, duplication by $\mathcal{C}$ raises total output and,
through the associated price reduction, reduces $\mathcal{I}$'s profits relative to
what would result if $\mathcal{C}$ utilized a process with a higher associated
marginal cost, as would happen if $\mathcal{C}$ chose $i$.8

As to $\mathcal{C}$'s decision, it is assumed that $\mathcal{C}$ observes whether a patent
occurs and chooses a response rule, which is a function indicating $\mathcal{C}$'s
response ($i$, $n$, or $d$) conditional on the observed patenting outcome. It
is assumed that in forming this response rule, $\mathcal{C}$ conjectures a patent
policy $\tilde{\Delta}$ for $\mathcal{I}$. Then $\mathcal{C}$ uses $\tilde{\Delta}$ to update (in the usual Bayesian
manner) his assessment of the probabilities with which the various profit
outcomes occur given the observed patent outcome. For example,
given $\tilde{\Delta}$ and that a patent has occurred, $\mathcal{C}$'s posterior probability that
$(\pi^i, \pi^d) = (\bar{\pi}^i, \bar{\pi}^d)$ is $\alpha^h\delta^h/v$, where $v = \alpha^h\delta^h + \alpha^i\delta^i + \alpha^d\delta^d$ is the
probability that a patent will be observed, given $\tilde{\Delta}$. As usual, $\mathcal{C}$'s
conjecture is required to be correct (i.e., $\tilde{\Delta} = \Delta$) in equilibrium. Thus, in
equilibrium $v$ is also the propensity to patent generated by the patenting
policy $\Delta$. Using the posterior distribution, $\mathcal{C}$ can calculate the
expected profit from any response he might make to both the occurrence
and nonoccurrence of a patent. For any given $\Delta$, if $p = 1$
denotes occurrence of a patent and $p = 0$ nonoccurrence, $\mathcal{C}$'s
response rule is determined by solving for the expected
profit-maximizing response to each value of $p$ and will be written

$$C^j(\Delta, p) = \begin{cases} 1 & \text{if } j \text{ is optimal given } (\Delta, p) \\ 0 & \text{otherwise} \end{cases}$$

for $j = n, i, d$.$^{9}$ Thus, $C^n(\Delta, 1) = 1$ means that nonparticipation is the
optimal response to a patent ($p = 1$) given $\Delta$. Note also that only one
of $C^j(\Delta, p)$ can equal unity for any given $(\Delta, p)$ combination and that by
definition of a patent $C^d(\Delta, 1) = 0$ for all $\Delta$.
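The updating step and the response rule can be sketched numerically. The following is a minimal Python illustration, not the authors' procedure: the state labels `h`, `i`, `d`, the profit pairs attached to them, and every number are assumptions chosen only so that imitation is profitable in states `h` and `i` and unprofitable in state `d`.

```python
def posterior(alpha, delta, patent):
    """Posterior over states given the patent outcome (Bayes' rule).

    alpha[s] is the prior probability of state s and delta[s] the
    probability that the innovator patents in state s (assumed inputs).
    """
    weight = {s: alpha[s] * (delta[s] if patent else 1.0 - delta[s])
              for s in alpha}
    total = sum(weight.values())  # equals v if patent, 1 - v otherwise
    if total == 0.0:
        raise ValueError("outcome has zero probability under this policy")
    return {s: w / total for s, w in weight.items()}


def best_response(alpha, delta, profits, patent):
    """Competitor's expected-profit-maximizing action among n, i, d.

    profits[s] = (imitation profit, duplication profit) in state s.
    Duplication is unavailable once a patent is observed.
    Ties are broken in favor of n (first key inserted).
    """
    post = posterior(alpha, delta, patent)
    expected = {"n": 0.0,
                "i": sum(post[s] * profits[s][0] for s in post)}
    if not patent:
        expected["d"] = sum(post[s] * profits[s][1] for s in post)
    return max(expected, key=expected.get), expected


# Hypothetical primitives for illustration only.
alpha = {"h": 0.3, "i": 0.3, "d": 0.4}
profits = {"h": (2.0, 1.0), "i": (2.0, -1.0), "d": (-1.0, 1.0)}
frequent = {"h": 0.5, "i": 0.5, "d": 1.0}   # patents often in good states
sparing = {"h": 0.2, "i": 0.2, "d": 1.0}    # patents rarely in good states
action_frequent, _ = best_response(alpha, frequent, profits, patent=True)
action_sparing, _ = best_response(alpha, sparing, profits, patent=True)
```

Under these assumed numbers, patenting too often in the states where imitation pays makes a patent weak evidence of unprofitability, so the competitor imitates; patenting sparingly in those states deters entry.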

The equilibrium concept is "leader-follower." The innovator $\mathcal{I}$
makes the initial patent decision and therefore is the leader. As the
follower, $\mathcal{C}$ makes a conjecture $\tilde{\Delta}$, which he assumes is unresponsive
to his choices, updates his probabilities as described above, and calculates
an optimal response rule using this updated distribution. Further,
$\mathcal{C}$'s conjectured $\tilde{\Delta}$ must coincide with $\mathcal{I}$'s equilibrium choice of $\Delta$.
On the other hand, $\mathcal{I}$ supposes that for whatever $\Delta$ he chooses $\mathcal{C}$ will
respond optimally to that $\Delta$ (i.e., takes $\mathcal{C}$'s response rule as given) and
then chooses a patent policy that maximizes his own expected profits.
That is, given $\tilde{\Delta} = \Delta$, $\mathcal{I}$'s choice of $\Delta$ solves

$$\max_{\Delta}\; v\,[C^n(\Delta, 1)P^m + C^i(\Delta, 1)P^i] + (1 - v)[C^n(\Delta, 0)P^m + C^i(\Delta, 0)P^i + C^d(\Delta, 0)P^d].$$

The equilibrium of this game is given by the following proposition.

PROPOSITION. (i) The optimal patenting policy for $\mathcal{I}$ is $\hat{\Delta} = (\hat{\delta}^h, \hat{\delta}^i, 1)$,
where $\hat{\delta}^h$ and $\hat{\delta}^i$ satisfy $\alpha^h\hat{\delta}^h + \alpha^i\hat{\delta}^i = -\alpha^d\underline{\pi}^i/\bar{\pi}^i$; (ii) $\mathcal{C}$'s optimal
response is $C^n(\hat{\Delta}, 1) = 1 = C^i(\hat{\Delta}, 0)$ with all others equal to zero.

$^{8}$ Again, explicit models are developed in Sec. II.

$^{9}$ It is trivial to demonstrate that randomization does not pay for $\mathcal{C}$.

In equilibrium, then, $\mathcal{I}$ chooses a policy that always results in a
patent if $\mathcal{C}$ would find $d$ profitable, whereas a patent occurs with some
probability otherwise. While the probabilities $\hat{\delta}^h$, $\hat{\delta}^i$ are not unique, it
will be seen that all of the model's observables depend only on the
sum $\alpha^h\hat{\delta}^h + \alpha^i\hat{\delta}^i$, which is uniquely determined. It is this fact that
allows for the formulation of a long list of predictions about observable
entities.

The proof of this proposition is tedious and can be found in the 
Appendix, but the basic flavor of the argument is not difficult. 

The behavior displayed in the proposition is the solution to $\mathcal{I}$'s
problem of choosing a patenting policy to maximize expected profits,
taking into account $\mathcal{C}$'s optimal response to the chosen policy. Suppose
then that $\mathcal{I}$ adopts the policy $\hat{\Delta}$. Why is $\mathcal{C}$'s best response $i$ when a
patent does not occur and $n$ when it does? Under $\mathcal{I}$'s patenting policy,
a patent can fail to arise only when $(\pi^i, \pi^d) = (\bar{\pi}^i, \bar{\pi}^d)$
or $(\bar{\pi}^i, \underline{\pi}^d)$, which share the feature that imitation is profitable for $\mathcal{C}$.
Equation (5) states that $\mathcal{C}$ prefers $i$ to $d$ or $n$ given that information.
Thus $i$ is $\mathcal{C}$'s best response to the absence of a patent.

When a patent occurs, $\mathcal{C}$ cannot duplicate and so must choose
between $i$ and $n$. Policy $\hat{\Delta}$ is chosen so that $\mathcal{C}$'s expected profit from $i$
(given observation of a patent) is equal to zero, which is the same as
the payoff to $n$. Thus, given $\hat{\Delta}$, $n$ is as good a response for $\mathcal{C}$ to make
to a patent as is $i$. Also, and more important for the rest of the
argument, $i$ dominates $n$ if $\delta^i$ or $\delta^h$ is increased even minutely (and
conversely for decreases in $\delta^i$ or $\delta^h$). This dominance occurs because
such increases make a patent a more likely outcome when $i$ is indeed
profitable.

Given $\mathcal{C}$'s actions, it is straightforward to see that $\mathcal{I}$ can do no better
than to choose the policy $\hat{\Delta}$. Recalling that the propensity to patent is
$v = \alpha^h\delta^h + \alpha^i\delta^i + \alpha^d$, $\hat{\Delta}$ yields $\mathcal{I}$ expected profits of $vP^m + (1 - v)P^i$.
Clearly $\mathcal{I}$ would do better if $v$ could be raised, since $P^m > P^i$. But
under the policy $\hat{\Delta}$, $\delta^d$ is already at its upper limit of one, and a choice
of a larger $\delta^i$ or $\delta^h$ would generate $i$ instead of $n$ as $\mathcal{C}$'s response to a
patent. In that case $\mathcal{I}$ certainly obtains $vP^i + (1 - v)P^i = P^i$, which is less
than $vP^m + (1 - v)P^i$ for any $v$. Thus $\hat{\Delta}$ is optimal because it succeeds
in generating $P^m$ as often as is possible.$^{10}$

Substituting $\hat{\Delta}$ into the expression for $v$, the implied equilibrium
propensity to patent is

$$v = \alpha^d\left(1 - \frac{\underline{\pi}^i}{\bar{\pi}^i}\right).$$

It can be checked that under the assumptions made above, $0 < v < 1$.
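As a check on the algebra, the equilibrium propensity can be computed directly from the zero-profit condition in the proposition. The sketch below is illustrative only: the function names and the numerical values are assumptions, and it takes for granted that the parameters satisfy the model's restrictions (in particular $\underline{\pi}^i < 0 < \bar{\pi}^i$ and a resulting $v$ below one).

```python
def equilibrium_propensity(alpha_d, pi_low, pi_high):
    """Equilibrium propensity to patent: v = alpha_d * (1 - pi_low / pi_high)."""
    if not (pi_low < 0.0 < pi_high):
        raise ValueError("need pi_low < 0 < pi_high")
    return alpha_d * (1.0 - pi_low / pi_high)


def imitation_profit_given_patent(alpha_d, pi_low, pi_high):
    """Competitor's expected imitation profit conditional on a patent.

    mass_high stands for alpha^h*delta^h + alpha^i*delta^i, pinned down by
    the proposition; the result should be zero up to rounding.
    """
    mass_high = -alpha_d * pi_low / pi_high
    v = mass_high + alpha_d
    return (mass_high * pi_high + alpha_d * pi_low) / v


# Hypothetical values: alpha^d = 0.4, low profit -1, high profit 2.
v_base = equilibrium_propensity(0.4, -1.0, 2.0)
v_higher_high_profit = equilibrium_propensity(0.4, -1.0, 3.0)
v_higher_low_profit = equilibrium_propensity(0.4, -0.5, 2.0)
```

With these assumed numbers, raising either the high or the low imitation profit lowers the computed propensity, which is the comparative static discussed in the next section.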

Numerous predictions about v can be made, and the next section 
considers these and other predictions. Several enrichments of the 
model are also pursued. 

II. Predictions and Elaborations 

A. Predictions 

The theory’s immediate predictions concern the firms’ basic patenting 
and imitation/duplication behavior: not all innovations will be pat¬ 
ented; those that are not patented will be imitated; patented goods or 
processes will be neither imitated nor duplicated. This last prediction 
might strike the reader as blatantly inconsistent with reality. 

A moment’s reflection reveals that the interpretation of proposed 
counterexamples must be undertaken with some caution. The predic¬ 
tion that patented goods are never imitated relies on the assumption 
that imitation is unprofitable with some positive probability (i.e., $\alpha^d >
0$). Cases in which the patent race generates a good that is invariably
profitable to imitate ($\underline{\pi}^i < 0$ is violated) must specifically be ruled out
when testing the prediction. In such cases, transferring information 
about profitability obviously cannot deter imitation since it is known a 
priori that imitation is always profitable. The anecdotal counterexam¬ 
ples to the proposition invariably fall into this category and hence do 
not constitute a refutation. This model is not informative about those 
stellar breakthroughs that generate enormous profit opportunities 
(although it will likely be for subsequent innovations in such areas). 
While these are interesting situations, as an empirical matter they are 
exceptional and the total resources involved are comparatively minor. 
The bulk of patenting activity takes place in more mundane circumstances,
and it is firm behavior in such circumstances that comprises
the data required to confront the present theory.$^{11}$

$^{10}$ This simple argument fails to be a proof only because it does not establish that
$\delta^d = 1$ is globally optimal.

Turning to more specific predictions, the model places a number of
restrictions on the equilibrium propensity to patent, $v$. First, $v$ depends
positively on $\alpha^d$, which occurs simply because patenting is always
optimal when imitations are unprofitable relative to duplication,
and $\alpha^d$ is the probability of that event. Also, since $\alpha^d = 1 - \alpha^h - \alpha^i$, $v$
declines with an increase in the probability that imitation is profitable
($\alpha^h + \alpha^i$). Therefore, as would be expected, all else constant, it is
anticipated that for products or processes that are likely to be
profitable to duplicate but unprofitable to imitate, there should be a
higher propensity to patent relative to those situations for which
imitation is likely to be profitable.

Second, $v$ is negatively related to both $\bar{\pi}^i$ and $\underline{\pi}^i$. To see why, recall
that $\alpha^h\delta^h + \alpha^i\delta^i$ is chosen to set expected profits from imitation equal
to zero, given a patent. When either $\bar{\pi}^i$ or $\underline{\pi}^i$ rises, expected profits
from imitation become positive. Thus $\alpha^h\delta^h + \alpha^i\delta^i$ must fall so that
observation of a patent is less likely when $\pi^i = \bar{\pi}^i$; in other words, the
patent is a stronger indication that $\pi^i = \underline{\pi}^i$ ($< 0$) has occurred.

These somewhat counterintuitive results follow from the informational
role of the patent. Since in equilibrium the occurrence of a
patent serves to signal to $\mathcal{C}$ that imitation is likely to be unprofitable,
any exogenous event making imitation more profitable results in a
decrease in the patenting propensity. Such a reduction in patenting
activity is required to render the occurrence of a patent more compelling
evidence that imitation is unprofitable.

Still more predictions can be generated if the components of $(\pi^i,
\pi^d)$ are modeled explicitly. As a straightforward way to impose the
required informational asymmetry for both the process and product
innovation cases, suppose that having innovated, $\mathcal{I}$ learns the additional
expenditures $\mathcal{C}$ would have to undertake to bring an imitation
or duplicate product to market, or implement an imitation or duplicate
process. When production occurs, these costs are sunk and can be
interpreted as $\mathcal{C}$'s additional R & D expenditures. Then, consistent
with the model above,

$$\pi^i = R^i - D^i,$$
$$\pi^d = R^d - D^d,$$

where $R^j$ are the revenues net of variable costs for $j = i, d$ (derived
below) and $D^j$ is the added development cost in each case. The $R^j$ are
assumed to be common knowledge, while $\mathcal{I}$'s private information is $D^j
= \underline{D}^j$ or $\bar{D}^j$ for $j = i, d$, with $\bar{D}^j > \underline{D}^j$. Thus $\bar{\pi}^i = R^i - \underline{D}^i$, for example.

$^{11}$ Note here that the profits of both $\mathcal{I}$ and $\mathcal{C}$ can be treated as expected profits, where
any random influences are realized after $\mathcal{I}$ chooses his patent policy; $\pi^i = \underline{\pi}^i$ would
then mean that $\mathcal{C}$'s profits from imitation are drawn from a distribution having an
expected value lower than would be the case if $\pi^i = \bar{\pi}^i$. Given this extension, the
observation of a patented good's being imitated, and that imitation's earning negative
profits ex post, is not inconsistent with the model. This interpretation can be used to
make a point concerning expected profits and firm failures. Suppose a firm is said to
have failed if ex post profits ($\tilde{\pi}$) fall below a certain level. Focusing on $\mathcal{C}$, $\mathcal{C}$ produces
only when $E(\tilde{\pi}) = \bar{\pi}^i$ is certain, so the failure rate for imitating firms is $f = \mathrm{pr}[\tilde{\pi} < 0 \mid E(\tilde{\pi})
= \bar{\pi}^i]$. For each innovation, $1 - v$ competitors are expected. Thus $(1 - v)f$ is the
expected number of failures. Consider raising $\bar{\pi}^i$, e.g. Presumably $f$ falls with such a
change, but according to arguments to follow in the text, $v$ does as well. (Why this result
occurs can be ignored for the moment.) No prediction on the effect of changes in
expected profits given imitation on the number of firm failures is available without
stronger restrictions. The increase in expected profits makes it less likely that any
imitator will fail, but more likely that firms will subject themselves to risk by choosing
imitation.

In the case of process innovation, it is assumed that if both firms
operate, they produce a homogeneous good, inverse demand for
which is given by $p = a - bQ$, where $Q = q^{\mathcal{I}} + q^{\mathcal{C}}$ is total output and $p$
is the product price. Production occurs at constant marginal cost $\gamma^{\mathcal{I}}$ for
$\mathcal{I}$ and $\gamma^i$ and $\gamma^d$ for $\mathcal{C}$, depending on whether $\mathcal{C}$ chooses $i$ or $d$, with $\gamma^{\mathcal{I}}
= \gamma^d < \gamma^i$. Solving the duopoly problems yields

$$P^i = \frac{(a - 2\gamma^{\mathcal{I}} + \gamma^i)^2}{9b},$$

$$R^i = \frac{(a - 2\gamma^i + \gamma^{\mathcal{I}})^2}{9b},$$

$$P^d = \frac{(a - \gamma^{\mathcal{I}})^2}{9b} = R^d.$$

Thus $P^i > P^d$, $R^i < R^d$. Since $v = \alpha^d\{1 - [(R^i - \bar{D}^i)/(R^i - \underline{D}^i)]\}$, it is
immediate that increases in $a$ or $\gamma^{\mathcal{I}}$ and reductions in $b$ or $\gamma^i$ all raise $R^i$
and hence lower $v$. Thus in particular the greater the magnitude of
the cost reduction due to innovation ($\gamma^i - \gamma^{\mathcal{I}}$), the more likely it is that
the innovation will be patented. This result arises because a greater
cost reduction raises $\mathcal{I}$'s level of output, lowering the equilibrium
product price and reducing the expected return to imitation. Thus
the patent need not convey such a strong signal and, hence, may be
allowed to occur more frequently.
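The process-innovation formulas are the standard Cournot profits for linear demand with asymmetric constant marginal costs, and the comparative static on the size of the cost reduction can be checked numerically. The sketch below is an illustration under assumed parameter values; the function names are hypothetical, and an interior equilibrium (both firms producing positive output) is assumed rather than verified.

```python
def cournot_profits(a, b, cost_innovator, cost_competitor):
    """Equilibrium profits (innovator, competitor) under p = a - b*Q.

    Each firm's profit is (a - 2*own cost + rival cost)^2 / (9b),
    valid when both equilibrium outputs are positive.
    """
    p_inn = (a - 2.0 * cost_innovator + cost_competitor) ** 2 / (9.0 * b)
    p_comp = (a - 2.0 * cost_competitor + cost_innovator) ** 2 / (9.0 * b)
    return p_inn, p_comp


def propensity(alpha_d, revenue, dev_cost_low, dev_cost_high):
    """v = alpha_d * {1 - (R - D_high) / (R - D_low)}."""
    return alpha_d * (1.0 - (revenue - dev_cost_high) / (revenue - dev_cost_low))


# Hypothetical numbers: a = 10, b = 1, innovator cost 2.
P_i, R_i = cournot_profits(10.0, 1.0, 2.0, 3.0)   # imitation at cost 3
P_d, R_d = cournot_profits(10.0, 1.0, 2.0, 2.0)   # duplication at cost 2

# A larger cost gap (imitator cost 3.5 instead of 3) lowers R_i ...
_, R_i_wide = cournot_profits(10.0, 1.0, 2.0, 3.5)
# ... and therefore raises the propensity to patent.
v_narrow = propensity(0.4, R_i, 0.5, 5.0)
v_wide = propensity(0.4, R_i_wide, 0.5, 5.0)
```

The development-cost values 0.5 and 5.0 are chosen only so that imitation is profitable at the low cost and unprofitable at the high cost, as the model requires.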

The product innovation interpretation is similar. Units of both
goods are chosen so that marginal cost is $\gamma$. If $\mathcal{C}$ chooses $i$, inverse
demand for the innovator's good is $p^{\mathcal{I}} = a - bq^{\mathcal{I}} - cq^i$ ($b, c > 0$),
where $q^{\mathcal{I}}$ and $q^i$ are the amounts produced by $\mathcal{I}$ and $\mathcal{C}$, respectively.
For simplicity, inverse demand for $\mathcal{C}$'s good is assumed to have the
same parameters as does that of the new good: $p^i = a - bq^i - cq^{\mathcal{I}}$. If $\mathcal{C}$
chooses $d$, then $p = a - b(q^{\mathcal{I}} + q^d)$ is the common inverse demand.
Solving the duopoly problem gives

$$P^i = \frac{b(a - \gamma)^2}{(2b + c)^2} = R^i.$$

Substitution of $R^i$ into $v$, plus differentiation and use of the stability
condition $c - 2b < 0$, yields the predictions that increases in $a$ and
decreases in $b$, $\gamma$, or $c$ all lead to a reduction in $v$ through raising $R^i$.

This result and the analogous one concerning changes in $a$ or $b$ for
process innovations raise an interesting issue. In either case, $\mathcal{I}$'s expected
profit is $vP^m + (1 - v)P^i$. Increases in $a$ and reductions in $b$,
either of which can be thought of as augmenting the size of the
market for $\mathcal{I}$'s output, raise both $P^m$ and $P^i$ under either the process
or product interpretation. However, such changes also improve $\mathcal{C}$'s
opportunities, leading to less frequent patenting and hence $\mathcal{I}$'s receiving
$P^i$ rather than $P^m$ more often. As a consequence, to the extent that
there are activities in which $\mathcal{I}$ can engage that have the effect of
increasing the size of the market (advertising, loss leading on other
goods, etc.), $\mathcal{I}$'s being subject to incomplete patent coverage implies a
weaker incentive to expend resources on such activities.

There is a final point to be made concerning the way in which the
patent's coverage influences the frequency of patenting. If an increase
in coverage restricts the choices available to $\mathcal{C}$ (i.e., in the
absence of a patent, the characteristics chosen for the imitation would
differ from those that could be chosen in the face of a patent), then $R^i$
is reduced by an increase in coverage. Accordingly, the expected
profit from imitation declines and the frequency of patenting rises
with an expansion of patent coverage.


B. Elaborations 

Reducing the Competitor’s Profit 

An assumption of the simple model is that the type of product $\mathcal{I}$
obtains on winning the patent race cannot be influenced by any action
$\mathcal{I}$ takes within the R & D process. However, it is reasonable to suppose
that the innovator does have some leeway in determining the characteristics
of the innovation he obtains. What is of particular interest is
an analysis of $\mathcal{I}$'s incentives to try to choose his product design so as to
make imitation more costly.

To consider this issue, suppose that $\mathcal{I}$ can devote resources to raising
the probability that the profits from imitation are low;$^{12}$ that is, the
innovator can devote resources to raising $\alpha^d$, subject to the constraint
$\alpha^h = 1 - \alpha^i - \alpha^d$. If it is assumed that (i) a minimal level of $\alpha^d$ (i.e.,
one that satisfies conditions [3] and [4]) is freely available, (ii) increments
to $\alpha^d$ can be achieved at increasing marginal cost, and (iii) the
chosen value of $\alpha^d$ is common knowledge (so $\mathcal{I}$'s informational advantage
remains as before), it is easily shown that the basic character of
the equilibrium is undisturbed. This occurs simply because, given $\alpha^d$,
the game is unchanged.

$^{12}$ Analysis of attempts to change the probability that duplication is profitable, presumably
leaving $\mathcal{I}$'s profits unaltered, is less interesting because duplication does not
occur in equilibrium.

Proceeding in this fashion, minor manipulation yields $\mathcal{I}$'s expected
profit as

$$vP^m + (1 - v)P^i - \chi(\alpha^d), \tag{7}$$

where $\chi(\alpha^d)$ is expenditures on increasing $\alpha^d$ above the minimal level;
$\chi' \geq 0$, $\chi'' > 0$. Raising $\alpha^d$ augments $v$, thereby increasing the probability
of receiving $P^m$ as opposed to $P^i$. The optimal $\alpha^d$ balances these
marginal returns against marginal cost $\chi'$.

The predictions are as follows. First consider parameters entering
(7) via $v$ only: $(\bar{\pi}^i, \underline{\pi}^i)$. Increases in $\bar{\pi}^i$ or $\underline{\pi}^i$ reduce $v$ for fixed $\alpha^d$. Such
changes reduce marginal returns, thus lowering the optimal $\alpha^d$. Since
$v$ is proportional to $\alpha^d$, the full effect ($\alpha^d$ varying) of these parameters
on $v$ is the same as the impact for given $\alpha^d$. For these parameters, (i)
the predictions discussed above stand qualitatively unaltered, and (ii)
parameter changes making $i$ more profitable when $\mathcal{C}$ is confronted
with a patent reduce the level of activities devoted to raising the
probability that $i$ is more costly.

An increase in $P^m$ raises marginal returns directly (not via $v$); optimal
$\alpha^d$ rises accordingly. Note that whereas above $v$ was not a function
of $P^m$, $v$ rises with $P^m$ here. An increase in $P^i$ operates in just the
opposite fashion.
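The first-order condition behind these predictions can be made concrete with a specific cost function. The sketch below assumes, purely for illustration, the quadratic form $\chi(\alpha^d) = \kappa(\alpha^d - \alpha^d_{\min})^2/2$, which is not specified in the text; the function name and all numbers are likewise hypothetical. With $v = \alpha^d m$ and $m = 1 - \underline{\pi}^i/\bar{\pi}^i$, the condition $m(P^m - P^i) = \chi'(\alpha^d)$ has a closed-form solution.

```python
def optimal_alpha_d(p_m, p_i, pi_low, pi_high, alpha_min, kappa):
    """Optimal alpha^d under the assumed cost chi = kappa*(a - alpha_min)^2 / 2.

    Solves m*(P^m - P^i) = kappa*(alpha_d - alpha_min), where
    m = 1 - pi_low/pi_high, and caps the result at 1/m so that v <= 1.
    """
    m = 1.0 - pi_low / pi_high
    interior = alpha_min + m * (p_m - p_i) / kappa
    return min(interior, 1.0 / m)


# Hypothetical baseline and three parameter changes.
base = optimal_alpha_d(10.0, 9.0, -1.0, 2.0, alpha_min=0.2, kappa=10.0)
higher_monopoly_profit = optimal_alpha_d(11.0, 9.0, -1.0, 2.0, 0.2, 10.0)
higher_imitation_profit = optimal_alpha_d(10.0, 9.0, -1.0, 3.0, 0.2, 10.0)
higher_shared_profit = optimal_alpha_d(10.0, 9.5, -1.0, 2.0, 0.2, 10.0)
```

Under this assumed cost function the directions match the text: a higher $P^m$ raises the chosen $\alpha^d$, while a higher $P^i$ or a more profitable imitation lowers it.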


Trade Secrecy

Thus far patenting has revealed information only to the extent that $\mathcal{C}$
obtains an improved estimate of the profitability of various options
through inferences based on correct conjectures regarding $\mathcal{I}$'s patenting
policy. Naturally patents may also make it less costly to produce
an imitation. For example, the patent may reveal that some
production processes work better than others. For these additional
reasons, $\mathcal{I}$ may not wish to patent; that is, "trade secrecy" could develop.
Does the model predict this?

Suppose that if $\mathcal{I}$ patents, imitation generates $\pi^i + k$ (still random);
otherwise imitation yields $\pi^i$ as above. Further, assume that the existence
of a patent does not ensure that imitation is always profitable:

$$\underline{\pi}^i + k < 0. \tag{8}$$

Condition (8) keeps the analysis within the spirit of that presented in
Section II.$^{13}$

Then, proceeding as above, the equilibrium is identical except that
$\bar{\pi}^i$ and $\underline{\pi}^i$ are replaced by $\bar{\pi}^i + k$ and $\underline{\pi}^i + k$. That is, the equilibrium is
just as in Section I, except that $\alpha^h\delta^h + \alpha^i\delta^i$ is lower. This implies that
the propensity to patent ($v$) is smaller; that is, trade secrecy emerges.

The explanation for this result is that were $\mathcal{I}$ to continue patenting
at the same rate as when $k = 0$ (i.e., when the patent system conveyed
no direct cost information), $\mathcal{C}$ would obtain enough profit-increasing
information to make it worthwhile, on average, to imitate when confronted
with a patent. Given this, $\mathcal{I}$ optimally reduces $\alpha^h\delta^h + \alpha^i\delta^i$ in
order for the occurrence of a patent to signal more forcefully that
imitation is unprofitable. Thus, $\mathcal{I}$ uses the signaling aspect of the
patent system to offset the fact that patenting reduces costs for imitators.
The result is a lower propensity to patent and a resultant
increase in $\mathcal{I}$'s resorting to trade secrecy.


A Patent Race 

As outlined above, the full model is a three-stage game: (i) a patent 
race yielding private information to the winner, (ii) the patenting 
game, analyzed above, and (iii) a textbook duopoly or monopoly de¬ 
pending on the outcome of the patenting game. This subsection pre¬ 
sents predictions about the patent race. 

In the literature on patent races (e.g., Mortensen 1982), it is as¬ 
sumed that the value of the innovation is known. In the present 
context this entity is a random variable because the firms engaging in 
the patent race face uncertainty concerning the private information 
that will emerge at the end of the race. This lack of information is the 
source of the new predictions. 

The model of the patent race is a very stylized one in which the 
dynamic elements of the race are suppressed, and it owes much to the 
analysis in Lazear and Rosen (1981). Specifically, there are two firms 
engaging in research. Research requires an investment of resources $r_f$
($f$ indexes firms, $f = 1, 2$) and yields output $q_f$ according to $q_f = r_f + \varepsilon_f$,
where $\varepsilon_f$ is a random variable with a density, and the $\varepsilon_f$ represent
independently distributed research luck.

For simplicity, suppose the firm having the larger research output
always succeeds, while the other fails. Then, for firm $f$ the probability
of winning the patent race is ($f \neq k$)

$$\psi(r_f, r_k) = \mathrm{pr}(q_f > q_k) = \mathrm{pr}(r_f - r_k > \varepsilon_k - \varepsilon_f) = G(r_f - r_k),$$

where $G(\cdot)$ is the cumulative distribution function of $\varepsilon_k - \varepsilon_f$; define $G' = g$.

$^{13}$ Note that the discussion holds $P^m$, $P^i$, and $P^d$ fixed, and so $k$ is best thought of as a
reduction in $\mathcal{C}$'s expected additional development costs. A more general analysis making
the same point can be constructed.
Let $E(P)$ and $E(\pi)$ denote expected profit given success (hence occupying
the role of $\mathcal{I}$ in the patent game) and failure (hence becoming
$\mathcal{C}$) in the patent race. Assuming that the cost of $r_f$ is $K(r_f)$, $K' > 0$,
$K'' > 0$, expected profit for firm $f$ is

$$\psi(r_f, r_k)E(P) + [1 - \psi(r_f, r_k)]E(\pi) - K(r_f).$$

In a symmetric Nash equilibrium, $r_f = r_k$, and the common level of
research resources $r$ is determined by $g(0) \cdot \Delta\pi = K'(r)$, where $\Delta\pi =
E(P) - E(\pi)$ is the "win-lose spread." The predictions about the level
of resources devoted to innovative activity can therefore be obtained
through examination of the net return to winning, $\Delta\pi$.
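The symmetric condition $g(0) \cdot \Delta\pi = K'(r)$ can be solved in closed form once functional forms are chosen. The sketch below assumes, for illustration only, normally distributed research luck with standard deviation `sigma` and a quadratic cost $K(r) = \kappa r^2/2$; neither assumption appears in the text, and existence of the symmetric pure-strategy equilibrium is taken for granted.

```python
import math


def density_at_zero(sigma):
    """g(0) for the difference of two independent N(0, sigma^2) shocks.

    The difference is N(0, 2*sigma^2), so its density at zero is
    1 / (2 * sigma * sqrt(pi)).
    """
    return 1.0 / (2.0 * sigma * math.sqrt(math.pi))


def symmetric_research_level(spread, sigma, kappa):
    """Solve g(0) * spread = K'(r) for the assumed K(r) = kappa * r**2 / 2."""
    return density_at_zero(sigma) * spread / kappa


# Hypothetical values: a larger win-lose spread raises research effort,
# while noisier research luck (larger sigma) lowers it.
r_small_spread = symmetric_research_level(spread=2.0, sigma=1.0, kappa=1.0)
r_large_spread = symmetric_research_level(spread=3.0, sigma=1.0, kappa=1.0)
r_noisy = symmetric_research_level(spread=2.0, sigma=2.0, kappa=1.0)
```

Because research effort is proportional to the spread under these assumptions, any parameter that moves $\Delta\pi$ moves $r$ in the same direction, which is the channel the text uses for its predictions.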

Substitution of the equilibrium value of $v$ into $\Delta\pi$ yields the result
that resources devoted to R & D rise with increments to $P^m$ and $\alpha^d$ and
fall with additions to $\bar{\pi}^i$ or $\underline{\pi}^i$ (or the parameters of $\pi^i$ introduced in
the analysis of process and product innovation).

These results are hardly earth shattering. More significant is the
implied correlation between patenting and research activity induced
by movements in the underlying exogenous variables. While increases
in $P^m$ raise $r$ and do not influence $v$, all other parameter changes cause
$r$ and $v$ to change in the same direction. Thus a basic and coarse
prediction of the model is that the level of R & D expenditure and
patenting activity are positively correlated. This is indeed what is
found in the data (see Pakes and Griliches 1980; Bound et al. 1982).
Moreover, that all parameters influence both $r$ and $v$ in the same way
implies that if $r$ is held constant, $v$ should vary but not significantly,
which is one inference that can be drawn from the empirical findings
in Pakes and Griliches. This result occurs because, except under extreme
parameterizations, it is not possible to change two parameters
so as to hold $r$ fixed and simultaneously induce large variation in $v$.


A Welfare Result 

Along with all of the usual considerations that obtain in any problem
with patents, an additional issue arises in this model with the introduction
of incomplete information. Specifically, it is of interest to
consider whether the patent system conveys an efficient amount of
information. A natural conjecture is that in equilibrium it does not,
and that requiring full revelation would be welfare improving. The
basis for this conjecture is simply that, in equilibrium, $\mathcal{I}$ is able to
prevent entry and reap monopoly profits in some situations in which,
with full revelation, entry and imitation would occur.

Although this argument has intuitive appeal, it is not generally 
correct so long as imitation involves a positive additional development 
cost. If there were no such cost, then usual social surplus arguments 
could be used to show that full revelation would be strictly welfare 
improving. With these costs, however, the value of lower prices and 
increased variety must be weighed against the incursion of the cost 
needed to acquire these benefits. This is the usual conflict arising in 
product differentiation problems. Whether the surplus arising from 
the revelation of the additional information more than offsets the 
costs created by the new information depends on the usual group of 
factors (demand elasticities, size of the fixed cost, etc.). Note that this 
result holds irrespective of whether the resources devoted to the pat¬ 
ent race would respond to the requirement of full revelation. 

III. Discussion of Assumptions 

In the analysis presented in Sections I and II, a number of specific 
assumptions were made. This section asks whether the analysis de¬ 
pends on these in any crucial way. In addition, the leader-follower 
equilibrium concept is discussed in some detail. 

A. Relaxation of $\alpha^l = 0$, or Condition (5)

The assumption $\alpha^l = 0$ guarantees that at least one activity in which $\mathcal{C}$
might engage is always profitable. It is not hard to see that the restriction
$\alpha^l = 0$ is inessential.

When $\mathcal{C}$ does not observe a patent, even with $\alpha^l = 0$, $i$ dominates $d$;
this is all the more so for $\alpha^l > 0$. If a patent occurs, $\mathcal{I}$'s problem is still
to communicate to $\mathcal{C}$ that $i$ is unprofitable. The assumption $\alpha^l > 0$
only renders this task easier. Setting $\alpha^l = 0$ is thus a useful
simplification.

As regards (5), the "additional development cost" model presented
in Section IIA suggests that while (5) is not an unreasonable restriction,
it could be that the additional development costs are so much
lower for $d$ that (5) fails. This appears more likely under the process
innovation interpretation.

Analysis similar to that above yields the following conclusion. Failure
of (5) does not change $v$ unless the following three inequalities
hold simultaneously:

$$E(\pi^i \mid \pi^d = \bar{\pi}^d) > 0,$$

$$[(1 - \delta)\alpha^h + \alpha^i]\bar{\pi}^i < (1 - \delta)\alpha^h\bar{\pi}^d + \alpha^i\underline{\pi}^d,$$

and

$$P^i > [\alpha^d + \delta(\alpha^h + \alpha^i)]P^m + (1 - \delta)(\alpha^h + \alpha^i)P^d,$$

where 


and 

$$\delta = \frac{-\alpha^d\underline{\pi}^i}{(1 - \alpha^d)\bar{\pi}^i}.$$

When these inequalities all hold, $\mathcal{C}$'s incentive to duplicate is so strong
that $v = 1$ is the equilibrium outcome, with $\mathcal{C}$ invariably imitating
(recall condition [3]). When a subset of the inequalities hold, while $\mathcal{C}$
always chooses to abstain when confronted with a patent, as would be
expected, $d$ can become the optimal response when $\mathcal{I}$ does not patent.

In sum, when (5) is violated $\mathcal{I}$'s behavior will not in general change,
while $\mathcal{C}$ may choose to duplicate when possible. The basic character of
the model as well as the predictions for the propensity to patent (in
particular the propensity to patent/R & D expenditure relationship)
are not altered except in the situation defined above. In the context of
the additional development cost model, this is more likely for process
innovation since $R^d > R^i$. In the product innovation case, the costs of
duplication must be so low relative to those for imitation and bulk
sufficiently large in $(\pi^i, \pi^d)$ that the revenue advantage from product
differentiation is overwhelmed. In such circumstances, $\mathcal{I}$ understands
that not patenting results in duplication, and so chooses the policy $\Delta
= (1, 1, 1)$.

B. Relaxing Condition (3) or (4) 

When either (3) or (4) does not hold, equilibria other than that dis¬ 
played in the proposition may occur. 

Take failure of (4) first. This alteration allows the possibility of an 
equilibrium in which d is dominated by n when there is no patent. 
Accordingly the patent has no role except as an information transfer 
mechanism. Since p* = 1 — p can convey the same information as p, 
this equilibrium has $ choosing the policy A* = (1 — , 1 — 5', 0), 

with the competitor responding by choosing i when confronted by a patent and n
otherwise. The existence of this "mirror-image" equilibrium is not a
matter of concern simply because (4) guarantees that the patent law
itself serves a real purpose.

When (3) fails, it is easy to check that the rule A = (1, 1, 1), with
the competitor always choosing n, is an equilibrium; that is, always patenting
completely deters the competitor. The logic is just that always patenting prevents
duplication and reveals no information; hence the competitor's expected profit from
imitation is $E(\pi^i) < 0$ and the competitor always prefers n to i. Further, it is possible
to show that when (3) fails, there is a continuum of equilibria
described by the (demonstrably nonempty) set of A triples satisfying

£(t t'|p = 0) ^ 0, 

£'(ti'|p = 1) =£ 0, 

and 

£(Tt''|p 0) s 0. 

where E(·) defines the competitor's profit expectations under the posterior
distribution generated by the A triple. The logic here is just that if imitation is
a poor option, the innovator can use the patent to deter duplication without fear
of signaling that imitation might be profitable. The deck is stacked in
the innovator's favor. The same is true when both (3) and (4) fail.

C. The Equilibrium Concept

In the equilibrium presented above, the competitor's entry rule specifies that he
not enter whenever the innovator patents an innovation, which immediately raises
the question: Why does the innovator not patent with probability one, thus earning
$P^m$ for certain? The most direct answer is that not doing so is a
basic feature of the leader-follower equilibrium concept; the leader's
strategy is not required to be a "best reply" to the strategy of the
follower.

The leader-follower equilibrium is employed here for both its
simplicity and its predictive content. The reader who finds this
equilibrium notion disconcerting because of the failure of the best-reply
property should be reassured by the following two facts. First, the
equilibrium propensity to patent derived above can arise in a Nash
equilibrium of the model. The equilibrium strategies for the competitor and the innovator will
be different but will still involve the innovator's patenting only some fraction of
the time. This equilibrium is no longer unique, however. 11 More
interesting, the equilibrium strategies of Section I emerge as a
sequential (perfect Nash) equilibrium in a slightly modified version of the
game. The modification is to the way in which one thinks about the
firm's patenting decision and involves thinking of the firm as deciding
each period on what policy (i.e., A triple) it will employ in making its
patent/no patent decision once the innovation is obtained.

11 The innovator's patenting with probability v and the competitor's always imitating is a Nash equilibrium for
any v

Because it is expositionally simpler, the leader-follower equilibrium 
was adopted above, the point being to focus attention on the model’s 
predictions rather than on the technical details of the game. 


IV. Summary 

This paper has considered a model of patenting behavior when pat¬ 
ents reveal information important to competitors. It was shown that 
the equilibrium in such a model involves a patent always occurring 
when the competitor would find imitation unprofitable, and some¬ 
times when imitation would be profitable. The competitor's optimal 
response to this behavior is to abstain entirely from production when 
a good or process is patented and to produce a differentiated product 
(or use a related process) when a patent does not occur. 

This equilibrium behavior implies a propensity to patent about
which many predictions can be made. First, as is found empirically,
the propensity to patent is not unity. Further, any factors making
imitation a more attractive strategy in the face of a patent make it less
likely that a patent will be chosen. This result occurs because under
such a change the decision to patent must convey stronger
information to competitors; that is, patenting must be a more surprising
event.

The model was extended in several directions, including explicit
treatment of the product or process innovation interpretations, trade
secrecy, possibilities for welfare improvement through mandated
revelation of private information, possibilities for attempts to reduce
profits directly for competitors, and an innovation race.

Finally, the impact of relaxation of several of the model's assump¬ 
tions was considered, and it was argued that the model could be given 
a Nash interpretation that is consistent with the results obtained. 


12 Formally, rather than assuming the strategy space (patent, no patent) as in the text,
the modified game would have a strategy space consisting of A triples. Imagine, for
instance, that each period a potential innovator decides on a policy for determining his
patenting behavior. Having learned the profitability of its innovation, the innovator would use his
policy to make a patent/no patent decision this period. The policy would in no way
commit the innovator to any future patenting decisions. Having observed the innovator's patent decision, the competitor
would then have to make an entry choice. He would do this by determining an entry
rule incorporating the innovator's patent decision and the fact that it was generated by some (not
necessarily observed) patenting policy. Then, the equilibrium strategies of the
proposition constitute a sequential equilibrium in this game. Thanks are due a referee for
pointing out this alternative specification.




Appendix 

To facilitate proof of the proposition, the following additional notation is
introduced.

$\phi(\cdot \mid p, A)$ = the competitor's posterior probability distribution of $(\pi^i, \pi^d)$ for a given
p and conjectured patent policy A;

$E^{\phi}(\pi^i \mid p)$ = the competitor's expected profit from imitation under the posterior
distribution $\phi(\cdot)$ given p;

$E^{\phi}(\pi^d \mid p)$ = the competitor's expected profit from duplication under the posterior
distribution $\phi(\cdot)$ given p.

Since, in equilibrium, it must be that the competitor's conjectured A equals the actual A
chosen by the innovator, this equality is assumed throughout the proof to simplify the
notation.

Proof of Proposition

It is first shown that, given A, the asserted C(-) is %’s best response. To see this 
note that - 0, and so C"( A. 1) = 1. Further, £* ,, (iTj 0) = E(t t*|- tr' = 

if') > f.('rr' , |-rr' — ir 1 ) = E* (ir' , |0), recalling (5), and /^(irjO) = ir 1 > 0, so C'(A, 

(» - 1 . 

We now show that, given the competitor's response to any A chosen by the innovator, the asserted policy
maximizes the innovator's expected profits. In fact, this policy yields the innovator an expected profit of

<«'' 4 a!'l h + a'S'+ [(1 — S‘)«* + (1 - l h )a h ]P‘ > 0. 

As this is tt weighted average of the two largest possible payoffs for 3, 3 can 
do better only by one of ((/) raising the probability of achieving V m while 
retaining (.''(A, 0) = 0 — C d ( A, 1) or (b) allowing tor some duplication. This 
will involve C"(A, 1) - 1 = C'( A, 0), since it is easy to (heck that 5 dominates 
,tnv strategy, A, that would result in either C'(A, 0) = 1 = C'(A, 1) or C'(A, 1) = 

1 = <7'(A, 0). 

focus on case a. First, any change in A that would result in C( A. 0) = 1 = 

<7 (A, 1) obviously reduces the probability of 3's receiving P m and so could not 
be optim.il. Also, C"(A, 0) = 1 = C"(A, 1) is obviously not possible. So consider 
the “mixed” cases. First, suppose the competitor retains f."(A, 1) = 0 and 
C (A, 0) = 1. The maximum profit for the innovator, given that this remains the competitor's best
response, is obtained as the solution to the problem

max 8*a* 4 S'a' 4 8'V = v 

subject to 

(&V‘ 4 6'a')fr' 4 8 rf aV s 0, i'VlD £ 0, 

[(1 - 8*)a* 4(1- 8 (/ )a' 1 j-r' 4 (I - 8')aV < [(1 - 8'ja* 

4 (1 - 5’)<*’]*' 4 <1 - 5'VV, 

/•;‘ t, (iT''|0) < fi'^firjO). 

Now solve the problem ignoring the second constraint. The first constraint 
is relaxed by an increase in 8 d while the maximand increases in 8 rf . So = 1 is 
optimal. As the maximand is rising in h h a h 4 8'ot\ the best that can be 
achieved is to set S^c/ 4 8'a' such that the first constraint holds as an equality. 
(That a* and a' are not too small to cause 8* = 8' = l to yield the constraint 
not binding is ruled out by E[ it'] < 0.) Next, For 8^ = 1, (5) implies that the 
second constraint is nonbinding for the A satisfying the first constraint as an 
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so ' vcs t J 1e fuH problem. Finally, since the Inst constraint 
+ o& = -a rr'/ir*. Thus, if C"(A, 1) = 0 and C(A, 0) = l,(g\ 
imal patent policy for 5. 

Still referring to case a, there is another possibility, namely, that C"( A, 0) — 

1 = C'(A, l). The auxiliary problem is now 

max( 1 - 8*)a* + (1 - 8*) a ' + (1 - b J )a'‘ 

subject to 

[(1 - 8* )a A + (1 - 8' )a' ]tt* + (1 - 8 rf )aV s 0, |0) s (), 

and 

t(l - 8*)«* + (1 - 8 rf )a < ']ff' / + (1 - 6jaV' £ (), E‘* > (tt' , | 0) s 0. 

Exit'll) a 0 is implied when the first constraint is satisfied. 

Again, ignore the second constraint. Since the problem is identical to the 
one just worked out, except that “patent" is replaced by "no patent," the 
innovator cannot do better: h' 1 — () is optimal in this problem. The first 
constraint can be written (as an equality) 

OL^Tl' 

(1 - ^)a h + (1 - 8')a' --- < 1 by E(rt') > 0. 

it' 

Now consider the left-hand side of the second constraint for 8' = 8'. S'' = 0 
= 8 ri : 

(a* + ct rf yn'' + (1 - g'jaV* > (a' 1 + a'')*'' + 

> cx'‘tt'' -I- a'rr'' 

2: 0 

(by [4)). Similarly, the left-hand side of the second constraint 5' = 0 = 8'',5 /l = 
8* yields 

[(1 - 8'* In'* + a d ]* d + ay> «■%'' + aV 1 > I) 

(by [4]). Therefore, no 8', 8* pair satisfying the first constraint as an equality 
can satisfy the second. Thus the maximal value of the maximand must be 
lower than under the previous problem. There is, therefore, no way of raising 
the probability of receiving P m above v while retaining C (A, 0) = 0 = 
C d (A, 1). 

This just leaves case b then: C"(A, 1) = 1 = C? (A, 0). The auxiliary problem 
is 

max h h a b + 8'a' + 8 J a rf 

subject to 

(8V + 8'a'Jit 1 + 8 'Vtt' < 0, E*(- ir'|l) s 0 , 

[(1 - 8'')a A + (1 - 8 ,y )ct‘ jfr'^ + (1 - 8')a'r f 1 
2 [(1 - 6 A )a A 4- (1 - 8')a'}if' 

+ (1 - S'V'V- E'^-it''|0) 2 /^(tt'IO). 

Again, ignoring the second constraint, f> d = 1 follows, yielding at best the 
same value of 8"a A + 8'a' + S d a d as before. But this means that expected profit 
involves P m no more often and P d (< P‘) at least as often as under 5. Thus 
expected profit is lower than under S. Q.E.D. 


equality. Thus, 
is binding a^S* 
5', 1) is the opt 
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The Financial Firm: Production with Monetary 
and Nonmonetary Goods 


Diana Hancock 

University of Santa Clara 


A microeconomic theory of the financial firm is developed that is
empirically testable. Financial firms are deposit-taking
intermediaries issuing their own liabilities, exemplified by banks. User
costs are derived for monetary goods such as demand and time
deposits. From the variable profit function, demands for and
supplies of monetary and nonmonetary goods are derived. A sample of
New York and New Jersey banks indicates that regularity conditions
in production are satisfied. The financial technology is relatively
inflexible for monetary goods, but less so for nonmonetary goods.


I. Introduction 

Monetary theory and policy affect the economy through financial
firms, profit-maximizing entities producing intermediation services
between borrowers and lenders. These firms, exemplified by banks,
use in production the services of monetary goods such as cash,
demand and time deposits, other financial goods such as loans, and
physical goods such as labor and materials. Financial firms may face
regulations on reserve requirements, deposit insurance, and interest
rates. Examination of the costs and benefits of these regulations, as
well as the analysis of monetary policy, requires a microeconomic
theory of the financial firm. 1


I am grateful to Erwin Diewert, Melvyn Fuss, John Murray, C. W. Sealey, Jr., Wayne
Lee, and three referees for their comments and suggestions. Data were supplied by the
Federal Reserve Bank of New York with the kind assistance of Carl Allen and Kenneth
Behrens.

1 Tobin (1961, p. 26). Hicks (1935) argues for the application of marginal analysis to 
monetary theory. 
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Estimation of financial technologies provides information on the 
demand for and supply of monetary goods and their ease of substitut¬ 
ability and transformation in production. To estimate a microeco¬ 
nomic model of financial production, user costs for monetary and 
nonmonetary goods are developed. The financial firm maximizes 
variable profit, total revenue less variable cost. The variable profit 
function depends on prices of services, derived from the user costs, 
and the quantity of capital. Changes in interest rates, reserve
requirements, and other regulations affect user costs and, through input
demands and output supplies, markets for monetary and
nonmonetary goods. The primary focus is on empirical application, specifically
to a longitudinal sample of banks in Federal Reserve District 2 (New
York and New Jersey).

In Section II, user costs are derived for six goods: loans, demand
deposits, cash, time deposits, materials, and labor, with capital fixed.
The estimating structure is described in Section III. Regularity tests 
for monotonicity and convexity are applied to the variable profit 
function. Section IV reports elasticities of transformation and supply 
of and demand for monetary and nonmonetary goods. A monetary 
and regulatory policy analysis is conducted in Section V for changes 
in interest rates, deposit insurance rates, and reserve requirements. 

The financial technology is relatively inflexible, implying that
interest rate increases must be severe to restrict monetary production. At
the geometric sample mean, the own compensated price elasticity of
supply for loans is 0.5160 and that for demand deposits 0.4164.
There is low monetary substitutability, as all cross-price elasticities
between cash and demand and time deposits are less than unity. 2
Monetary indices such as M1 reported as simple sums can be
misleading, since they require perfect substitution between components, and
this is not supported. 3 The results indicate that it is possible to
implement a model of production including monetary and other financial
goods, in addition to the more conventional physical resources of
labor, capital, and materials.


2 Monetary substitution on the demand side is also limited. Barnett (1981, pp. 351-53),
Klein (1974, p. 940), and Offenbacher (1980, p. 56) obtain estimates of partial
elasticities of substitution close to zero between monetary goods held by consumers.

3 The simple sum index is formed by adding together unweighted dollar balances of
monetary goods. Friedman and Schwartz (1970, pp. 151-52; 1982) term the
characteristics of money, including risk and liquidity, "moneyness." The simple sum form
implies that there is no difference in moneyness between monetary goods. Monetary
subaggregates that do not impose one-for-one substitutability, based on user cost
weights, have been derived by Barnett (1980) and Barnett, Offenbacher, and Spindt
(1984).
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II. Production in the Financial Firm 

A. Variable Profit 

Let quantities of variable goods be $x = (x_1, \ldots, x_{K-1})$, with capital $x_K$
fixed in the short run. Variable goods include the services of assets,
liabilities, and items off the balance sheet, such as labor. Outputs are
$x_1, \ldots, x_J$ and inputs $x_{J+1}, \ldots, x_{K-1}$, with positive prices
$V = (V_1, \ldots, V_{K-1})$, although whether a good is an output or an input is not known
a priori. Outputs are measured positively and inputs negatively.

Variable profit, total revenue less variable cost, is $g = Vx$. The
financial firm maximizes variable profit subject to $x$ being in $S$, the
production possibility set. The variable profit function is

$$\pi^*(V, x_K) = \max\{Vx : x \in S\}. \tag{1}$$

Since the variable profit function is linearly homogeneous in prices, it
can be expressed as $\pi(v, x_K)$, where $\pi = \pi^*/V_l$ and $v = V/V_l$, with the
price of the $l$th good a numeraire.

Supplies of outputs and demands for inputs are

$$x_i = \frac{\partial \pi(v, x_K)}{\partial v_i}, \qquad i = 1, \ldots, K - 1,\ i \neq l. \tag{2}$$

The relative expenditure on an input or revenue from an output is
$v_i x_i/\pi = \partial \ln \pi(v, x_K)/\partial \ln v_i$, $i = 1, \ldots, K - 1$. Estimation of $\pi(v, x_K)$,
the normalized version of (1), and the supply and demand functions,
either as (2) or in relative expenditure forms, requires user costs and
prices for physical and financial goods.

User costs and prices for monetary goods are derived from an
intertemporal model of financial production. Let $B_t$ be expenditures
on variable physical goods, labor, and materials in period $t$ and $P_t$ be a
general price index. The real balance of financial good $i$ is $y_{i,t}$ and the
holding cost or revenue per dollar per period is $h_{i,t}$, where $i = 1, \ldots, L$
for liabilities and $i = L + 1, \ldots, L + A$ for assets. Holding costs or
revenues are contracted for on the initial balance but paid or received
at the end of the period.

The total net cost of services to the firm produced by liability $i$
during period $t$ is $(1 + h_{i,t-1})y_{i,t-1}P_{t-1} - y_{i,t}P_t$. The first term is the
initial nominal liability $y_{i,t-1}P_{t-1}$, plus holding costs or revenues
incurred at unit rate $h_{i,t-1}$. From this is subtracted the total nominal
liability to depositors at the end of the period, $y_{i,t}P_t$. On an asset such
as a loan, flows are received at the end of the period, and the financial
firm is effectively repaid its outstanding balance at the end of the
period. Variable profit during period $t$ is

$$g_t = -B_t - \sum_{i=1}^{L+A} b_i\left[(1 + h_{i,t-1})y_{i,t-1}P_{t-1} - y_{i,t}P_t\right], \tag{3}$$

where $b_i = 1$ if $i = 1, \ldots, L$ and $b_i = -1$ if $i = L + 1, \ldots, L + A$. The
$L$th liability is capital, fixed in the short run, so $b_L = 0$. The variable
profit function for period $t$ is the maximum of $g_t$ over the quantities of
physical and financial goods. Increases in labor and materials costs
reduce variable profit, as do increases in the costs of servicing
monetary liabilities.

The factor $\rho_t$ discounts cumulatively over $t$ periods, or $\rho_t = \prod_{s=1}^{t} 1/(1 + R_s)$,
where $R_s$ is the discounting rate in period $s$, and $R_s = 0$ if $s = 1$.
The capitalized value of total profit over $t = 2, \ldots, T$ is

$$\sum_{t=2}^{T} \rho_t g_t = -\sum_{t=2}^{T} \rho_t B_t - \sum_{t=2}^{T}\sum_{i=1}^{L+A} \rho_t b_i\left[(1 + h_{i,t-1})y_{i,t-1}P_{t-1} - y_{i,t}P_t\right]. \tag{4}$$

This is analogous to (3), with capital both variable and included as a
liability. The coefficients of $y_{i,t}$ in (4) are user revenues, the effects on
profits of unit dollar increases. Their negatives are user costs $U_{i,t}$ for
financial goods or, for each $t = 2, \ldots, T$,

), »=1-- L + A, i * L. (5) 


For the variable profit function, in real terms

$$\frac{U_{i,t}}{P_t} = \begin{cases} \dfrac{h_{i,t} - R_t}{1 + R_t}, & i = 1, \ldots, L - 1, \\[2ex] \dfrac{R_t - h_{i,t}}{1 + R_t}, & i = L + 1, \ldots, L + A. \end{cases} \tag{6}$$

Since if $U_{i,t} > 0$ variable profit is reduced and if $U_{i,t} < 0$ it is increased,
the former condition classifies good $i$ as an input and the latter as an
output. As the discounting rate increases, liability items such as
demand and time deposits are more likely to be classified as outputs.
The user costs of assets such as loans are reduced, as the $R_t - h_{i,t}$
become more positive, and these are more likely to be classified as
inputs.

With this classification, a change of variables from $(U_{i,t}, y_{i,t})$ to
$(V_{i,t}, x_{i,t})$ can be performed. The former pairs are user costs and
nonnegative real balances. The latter pairs are strictly positive prices and
positive output and negative input quantities. 4 The change of variables is
$V_{i,t} = |U_{i,t}|$, the absolute value, and $x_{i,t} = y_{i,t}$ if $U_{i,t} < 0$, with
$x_{i,t} = -y_{i,t}$ if $U_{i,t} > 0$.

4 This permits adoption of the convention on signs for input and output prices and
quantities of Debreu (1959, p. 38).


B. User Costs and Data 

User Costs: Monetary and Nonmonetary Goods 

For application to monetary and other financial goods, the holding
costs $h_i$ (with the time subscript suppressed) are specialized to include
interest rates, reserve requirements, and service charges. These
holding costs are derived separately for asset and liability items.

For the services of $1.00 of an asset per period, interest rate $r_i$,
$i = L + 1, \ldots, L + A$, is earned. Service charges, including late loan
payments and stand-by charges, are received at rate $s_i$, and capital
gains are at rate $c_i$. Provision for loan losses and insurance premia is at
rate $d_i$. Hence $h_i = r_i + c_i + s_i - d_i$ is the one-period holding revenue
per dollar for an asset. 5

The real user cost of the services of a financial asset is from (6):

$$\frac{U_i}{P} = \frac{R - h_i}{1 + R} = 1 - \frac{1 + r_i + c_i + s_i - d_i}{1 + R}, \qquad i = L + 1, \ldots, L + A. \tag{7}$$


For liability $i = 1, \ldots, L - 1$, let $r_i$ be the interest rate payable, $d_i$ the
deposit insurance premium rate, and $s_i$ the service charge earned per
dollar. If $k_i$ is the reserve requirement on liability $i$, for each dollar
deposited $(1 - k_i)$ is available for usage by the firm. The tax imposed
by the reserve requirement is $Rk_i$. The one-period holding cost per
dollar of liability $i$ is $h_i = r_i + Rk_i - s_i + d_i$. A depositor, on
withdrawal of $1.00, receives $(1 - k_i)$ of claim on the financial firm and $k_i$
from required reserves.

The real user cost per dollar of the services of a liability is

$$\frac{U_i}{P} = \frac{h_i - R}{1 + R} = -(1 - k_i) + \frac{1 - k_i + r_i - s_i + d_i}{1 + R} = -1 + \frac{1 + r_i + Rk_i - s_i + d_i}{1 + R}, \qquad i = 1, \ldots, L - 1. \tag{8}$$

(See n. 6.)


5 This is analogous to the user cost for assets held by consumers in Barnett (1981, pp.
195-97).

6 In continuous time, the user cost per dollar of an asset is $U_i = R - h_i$,
$i = L + 1, \ldots, L + A$, and for a liability $U_i = h_i - R$, $i = 1, \ldots, L - 1$, with receipts and
payments occurring instantaneously.
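
The holding-cost and user-cost definitions in (6)-(8) can be illustrated with a short numerical sketch. The function names and all rates used in the usage note are hypothetical and chosen only for illustration; they are not taken from the FCA data. The sign rule (positive user cost means input, negative means output) follows the classification stated in the text.

```python
def asset_holding_revenue(r, c=0.0, s=0.0, d=0.0):
    """One-period holding revenue per dollar of an asset: h = r + c + s - d."""
    return r + c + s - d


def liability_holding_cost(r, R, k=0.0, s=0.0, d=0.0):
    """One-period holding cost per dollar of a liability: h = r + R*k - s + d."""
    return r + R * k - s + d


def real_user_cost(h, R, kind):
    """Real user cost per dollar, as in (6): (h - R)/(1 + R) for a liability,
    (R - h)/(1 + R) for an asset."""
    if kind == "liability":
        return (h - R) / (1.0 + R)
    if kind == "asset":
        return (R - h) / (1.0 + R)
    raise ValueError("kind must be 'asset' or 'liability'")


def classify(user_cost):
    """Positive user cost: input; otherwise output.
    (The text does not discuss a user cost of exactly zero.)"""
    return "input" if user_cost > 0 else "output"
```

With a hypothetical discounting rate of 5 percent, a demand deposit paying no interest with `k=0.11`, `s=0.0078`, `d=0.0003` has a negative user cost and is classified as an output, while cash (an asset with zero holding revenue) has user cost `R/(1 + R)` and is an input.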
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Data: Monetary and Other Financial Goods 

The principal data source is the Functional Cost Analysis (FCA), an
annual survey conducted by the Federal Reserve. The data are 1973-78
longitudinal observations on the balance sheet, profit and loss
statement, and employment for 18 New York-New Jersey banks, all
members of Federal Reserve District 2.

The discounting rate $R_s$, $s = 1973, \ldots, 1978$, is selected to satisfy
the feasibility condition that variable profits be nonnegative each year.
The one used is the highest interest rate satisfying the feasibility
condition. The set of interest rates tested for feasibility contains all those
either paid on deposits or received on loans by any sample bank in a
given year. The rates vary by year and range from 4.5 to 5.1 percent. 7
Any rate sufficiently high to affect the classification of inputs and 
outputs is infeasible, with variable profit negative for at least one 
sample bank in that year. Lower discounting rates than those used do 
not alter the classification. 
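
The selection rule for the discounting rate can be sketched as a simple search. This is only an illustration of the rule as described: `select_discount_rate` and the per-bank profit functions are hypothetical stand-ins, since the actual computation of variable profit at a given rate uses the full user-cost construction and the FCA data.

```python
def select_discount_rate(candidate_rates, bank_profit_fns):
    """Return the highest candidate rate at which every bank's variable
    profit is nonnegative, or None if no candidate is feasible.

    candidate_rates: rates observed on deposits or loans in a given year.
    bank_profit_fns: one callable per bank mapping a rate R to that bank's
                     variable profit (hypothetical placeholder).
    """
    feasible = [
        R for R in candidate_rates
        if all(profit(R) >= 0 for profit in bank_profit_fns)
    ]
    return max(feasible) if feasible else None
```

For example, with two toy banks whose variable profit falls linearly in the rate, the rule picks the largest candidate that keeps both banks' profits nonnegative.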

For calculation of the holding costs $h_i$, data are available on $s_i$ and
$d_i$ for five types of loans receivable: investments and securities, real
estate mortgages, installment loans, credit card loans, and
commercial, agricultural, and other loans. Capital gains and losses $c_i$ are
included for investments and securities on a realized basis. The implicit
price deflator for finance and insurance in the national accounts
issues of the Survey of Current Business is used for the price index $P$. This
permits data on financial quantities to be constructed as real balances
and user costs of monetary and nonmonetary goods to be expressed
in nominal terms. Financial goods quantities are measured in real
terms, analogous to those for physical goods.

Among assets, all loan categories have negative $U_i$ for the sample.
The user cost of cash is $U_2 = PR/(1 + R)$, since $r$, $c$, $s$, and $d$ are zero.
Let $V_2 = |U_2|$ be the price of cash. The corresponding real balance for
cash, $x_2$, is the amount of real excess reserves held above required
reserves and is negative, since cash reserves are an input.

Demand and time deposits are goods 3 and 4. The interest rate,
service charges, and deposit insurance premium rates paid to the
Federal Deposit Insurance Corporation (FDIC) are from the FCA. 8

7 Negative variable profit can be consistent with short-run operation if there are
transactions costs and barriers to entry and exit. Once this is permitted, the discounting
rate and user costs become arbitrary, as short-run losses can continue without limit.
Since the observational period is 1 year, it is plausible to require that variable costs over
such a duration be covered. The discounting rates obtained are typical of interest rates
obtaining during the sample period, such as in Wilcox (1983). There it is argued that
supply shocks such as for energy caused real interest rates to be low and possibly
negative in the 1970s by reducing the marginal productivity of capital.

8 The interest rate used on both time and demand deposits is that paid, an explicit
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Reserve requirements are obtained from the Federal Reserve Bulletin. 
For all observations, $U_3$ is negative and $U_4$ positive, so demand deposit
services are an output and time deposit services an input. 9 Quantities
are dollar amounts on the balance sheet deflated by the price index,
with $x_3$ positive and $x_4$ negative. The prices are $V_3 = |U_3|$ and
$V_4 = |U_4|$.

There are two other liability categories, for borrowed and
purchased funds and nondeposit funds, and user costs are constructed
for these. A Tornquist index of net loans, with price, is constructed
from the five types of loan assets and two nondeposit liabilities. 10 This
index number of loans is $x_1$ and is positive, with price $V_1 = |U_1|$.

To illustrate the user costs and their magnitudes, summary data on 
demand and time deposits are presented in table 1. For demand 
deposits, interest rates paid are zero until November 1, 1978, the date 
of inception of Negotiable Order of Withdrawal (NOW) accounts in 
Federal Reserve District 2. At the 1973 sample mean, the cost of the 
deposit insurance premium is 0.0347 percent, while service charges 
earned are 0.7785 percent. The reserve requirement tax is 0.5547 
percent. 


Data: Labor, Materials, and Capital 

For labor input, the FCA distinguishes managerial and nonmanagerial
employees. A Tornquist quantity index of labor input is
constructed. 11 The price of labor services is the ratio of the total labor
compensation to the labor quantity index.

rate. Klein (1974, p. 935) defines alternatively an implicit rate. The rental price of
monetary services is equal to the marginal cost of producing the services. The interest
rate is that obtained by the bank on the services from assets produced by the deposit.

9 Discounting rates sufficiently high to classify time deposits as outputs do not satisfy
the feasibility condition. Banks hire time deposits as inputs, analogous to hiring labor,
to produce output services.

10 "Loans" include investments, real estate mortgages, installment loans, credit card
loans, and commercial, agricultural, and other loans on the asset side, and borrowed
and purchased funds on the liability side. User costs are constructed separately for each
of these seven, so it is possible to specify a more extensive financial technology. For a
two-good case, an example of the construction of a Tornquist index is presented in n.
11.

11 A Tornquist quantity index can be constructed as follows, using the two categories
of labor. For a given bank, let total employment of managerial and nonmanagerial
labor at time $t$ be $L_{1,t}$ and $L_{2,t}$. Corresponding wages are $W_{1,t}$ and $W_{2,t}$. Total
compensation paid is $C_t = \sum_{i=1}^{2} W_{i,t}L_{i,t}$. The shares in compensation are
$w_{i,t} = W_{i,t}L_{i,t}/C_t$ and their 2-period moving averages $z_{i,t} = (w_{i,t-1} + w_{i,t})/2$
over $t - 1$ and $t$. The growth rate of the Tornquist index of labor input is, in discrete
time (where $\Delta$ is the first-difference operator), $\Delta \ln x_{5,t} = \sum_{i=1}^{2} z_{i,t}\, \Delta \ln L_{i,t}$.
This is a weighted average of the employment growth rates. The underlying functional
form aggregating the two types of labor $(L_{1,t}, L_{2,t})$ is translog. The weights are shares
in total compensation, arithmetically averaged over the two periods. If $t = 1, \ldots, T$
and $x_{5,1} = 1$ by normalization, $x_{5,t} = \exp\left(\sum_{s=2}^{t} \Delta \ln x_{5,s}\right)$. The corresponding
price of labor $V_{5,t}$ is $C_t/x_{5,t}$.
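
The construction in n. 11 can be sketched in a few lines. This is a minimal illustration under the footnote's definitions (compensation shares averaged over adjacent periods, index normalized to one in the first period); the function names and any sample figures are hypothetical.

```python
import math


def compensation_shares(quantities, wages):
    """Share of each labor category in total compensation for one period."""
    total = sum(q * w for q, w in zip(quantities, wages))
    return [q * w / total for q, w in zip(quantities, wages)]


def tornquist_quantity_index(quantities_by_period, wages_by_period):
    """Chain the Tornquist growth rates into an index equal to 1 in period 1.

    Each element of the inputs is a list over labor categories for one period.
    """
    index = [1.0]
    for t in range(1, len(quantities_by_period)):
        prev_q, cur_q = quantities_by_period[t - 1], quantities_by_period[t]
        prev_w = compensation_shares(prev_q, wages_by_period[t - 1])
        cur_w = compensation_shares(cur_q, wages_by_period[t])
        # Weighted average of log employment growth, weights = averaged shares.
        growth = sum(
            0.5 * (pw + cw) * math.log(cq / pq)
            for pw, cw, pq, cq in zip(prev_w, cur_w, prev_q, cur_q)
        )
        index.append(index[-1] * math.exp(growth))
    return index


def implicit_price(quantities, wages, index_value):
    """Price of labor services: total compensation divided by the index."""
    return sum(q * w for q, w in zip(quantities, wages)) / index_value
```

A quick sanity check on the design: if employment in both categories doubles between two periods, the averaged shares sum to one and the index doubles, whatever happens to wages.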
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Material services include stationery, printing and supplies, tele¬ 
phone, telegraph, postage, freight, and delivery. Wholesale and pro¬ 
ducer price index series for these are from the U.S. departments of 
Commerce and Labor, as published in the national accounts issues of 
the Survey of Current Business. The quantity of materials is the ratio of 
total expenditures to a Tornquist materials price index. The price
and quantity of labor and materials are $(V_5, x_5)$ and $(V_6, x_6)$,
respectively, with $x_5$ and $x_6$ negative as inputs.

Real physical capital is an index of structures, computers, and other 
equipment. Stocks are constructed by the perpetual inventory 
method. Each year, constant dollar investment is added to the previ¬ 
ous stock, and depreciation is subtracted. Depreciation rates are ob¬ 
tained from the U.S. Treasury Department Bulletin F series on asset 
lives. Financial capital is the difference between real financial assets 
and liabilities. The sum of this and real physical capital is total capital, 
x K , equal to economic shareholders’ equity. 
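
The perpetual inventory recursion described above can be sketched as follows. The sketch assumes geometric depreciation at a constant rate applied to the previous year's stock; the text states only that depreciation rates come from asset lives, so this depreciation pattern, the function name, and the sample figures are assumptions.

```python
def perpetual_inventory(initial_stock, investments, depreciation_rate):
    """Capital stocks by year: previous stock less depreciation plus
    constant-dollar investment (geometric depreciation is an assumption)."""
    stocks = [initial_stock]
    for investment in investments:
        surviving = stocks[-1] * (1.0 - depreciation_rate)
        stocks.append(surviving + investment)
    return stocks
```

For instance, starting from a stock of 100 with a 10 percent depreciation rate, annual investment of 10 exactly replaces depreciation and holds the stock constant.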

III. Specification 

The prices and quantities of the goods are:

$(V_1, x_1)$ loans: service price per efficiency unit and efficiency units of
loans,

$(V_2, x_2)$ cash: service charge per dollar and real dollar balance,

$(V_3, x_3)$ demand deposits: service price per dollar and real dollar
balance,

$(V_4, x_4)$ time deposits: service charge per dollar and real dollar
balance,

$(V_5, x_5)$ labor: wage per efficiency unit and efficiency units of
managerial and nonmanagerial labor,

$(V_6, x_6)$ materials: price per unit of service and efficiency index.

Capital is fixed at $x_K$ and measured positively. Net loan revenue $V_1x_1$ is
nonnegative, while the analogous demand deposit term is $V_3x_3$.
Expenditures on cash and time deposits are $V_2x_2$ and $V_4x_4$. Variable
profit is $\pi^* = \sum_{i=1}^{6} V_ix_i$.

The variable profit function is $\pi^*(V, x_K)$, where $V = (V_1, \ldots, V_6)$. 12
A translog specification for the variable profit function is (where $\alpha$
and $\beta$ with appropriate subscripts denote parameters)


12 This is before corporate income tax on this economic as opposed to accounting
definition in $\pi^*$. All banks for all years have accounting profit and economic variable
profit such that the marginal rate of corporate income tax is 48 percent. Since the
dependent variable in (10) is in logarithms, $\ln (1 - t)\pi^* = \ln \pi^* - t$, where $t$ is the
marginal rate of corporate income tax. The average corporate income tax rates are
almost identical to marginal rates during the period. The tax effect is thus subsumed in
the intercept of the variable profit function and does not enter supplies or demands.



$$\ln \pi^* = \alpha_0 + \sum_{i=1}^{6} \alpha_i \ln V_i + \alpha_K \ln x_K + \tfrac{1}{2}\sum_{i=1}^{6}\sum_{j=1}^{6} \beta_{ij} \ln V_i \ln V_j + \sum_{j=1}^{6} \beta_{jK} \ln V_j \ln x_K + \tfrac{1}{2}\beta_{KK}(\ln x_K)^2. \tag{9}$$


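Linear homogeneity in prices implies $\sum_{i=1}^{6} \alpha_i = 1$ and
$\sum_{j=1}^{6} \beta_{ij} = 0$, $i = 1, \ldots, 6$. Demands for inputs and
supplies of output are

$$\frac{\partial \ln \pi^*}{\partial \ln V_i} = \frac{V_i x_i}{\pi^*} = \alpha_i + \sum_{j=1}^{6} \beta_{ij} \ln V_j + \beta_{iK} \ln x_K \tag{10}$$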


for $i = 1, \ldots, 5$. Here $\pi = \pi^*/V_6$ and $v_i = V_i/V_6$, $i \neq 6$, with $x_i$ positive
for loans and demand deposits and negative for cash, labor, and time
deposits. The dependent variable is the relative expenditure on each
good with respect to variable profit. The equations (9) and (10) constitute
the estimating system. Excluded is the materials demand, since
the system is linearly dependent, reducing the number of estimating
equations to six. Additive disturbances are included, with nonzero
contemporaneous covariances. Potential sources for the disturbances
are approximation errors in the specified variable profit function $\pi$,
errors in profit maximization or in satisfaction of first-order conditions,
and differences in nonmeasured components between firms,
such as entrepreneurial ability and locational rents.
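As an illustration of how share equations of the form (10) are evaluated, the following minimal Python sketch computes relative expenditures $e_i = \alpha_i + \sum_j \beta_{ij} \ln V_j + \beta_{iK} \ln x_K$. The function name `translog_shares` and all parameter values are hypothetical, chosen only so that the homogeneity restrictions hold; they are not estimates from this paper. The restriction that the $\beta_{iK}$ sum to zero is not stated in the text and is assumed here because it follows from linear homogeneity.

```python
import math

def translog_shares(alpha, beta, beta_k, prices, x_k):
    """Relative expenditures e_i = alpha_i + sum_j beta_ij ln V_j + beta_iK ln x_K."""
    log_v = [math.log(v) for v in prices]
    log_k = math.log(x_k)
    return [
        a + sum(b_ij * lv for b_ij, lv in zip(row, log_v)) + b_k * log_k
        for a, row, b_k in zip(alpha, beta, beta_k)
    ]

# Hypothetical three-good example (one output, two inputs).
# Homogeneity: the alphas sum to one and each row of beta sums to zero.
alpha = [1.5, -0.3, -0.2]
beta = [[0.10, -0.06, -0.04],
        [-0.06, 0.05, 0.01],
        [-0.04, 0.01, 0.03]]
beta_k = [0.02, -0.01, -0.01]  # assumed to sum to zero (see lead-in)

shares = translog_shares(alpha, beta, beta_k, prices=[1.2, 0.9, 1.1], x_k=1.3)
doubled = translog_shares(alpha, beta, beta_k, prices=[2.4, 1.8, 2.2], x_k=1.3)
at_mean = translog_shares(alpha, beta, beta_k, prices=[1.0, 1.0, 1.0], x_k=1.0)
print(shares, sum(shares))
```

Under these restrictions the shares sum to one, are unchanged when all prices are scaled by a common factor, and reduce to the $\alpha_i$ when prices and capital are normalized to unity, which is why one equation of the system is redundant.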

The observations are a pooled time series and cross section of 18
banks, with six equations for each. The dependent variables are ($\ln \pi$,
$v_1 x_1/\pi$, $v_2 x_2/\pi$, $v_3 x_3/\pi$, $v_4 x_4/\pi$, $v_5 x_5/\pi$). The variables ($v$, $x_K$) are
normalized at unity at their geometric sample mean. In addition to the
independent variables of (9) and (10), individual bank dummy variables
are included. The model is estimated with these bank effects
included and without them. The inclusion of bank effects, the
“within” method of Mundlak (1978), leads to unbiased estimates with
pooled time-series and cross-section data.
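The role of the bank effects can be sketched with a single regressor. Including a dummy for each bank gives the same slope as regressing deviations from bank means on one another, which is the "within" transformation. The sketch below uses three hypothetical banks whose fixed effects are correlated with the regressor; the data and the function name `within_slope` are illustrative assumptions, not the paper's sample or estimator.

```python
from collections import defaultdict

def within_slope(groups, x, y):
    """Slope of y on x after removing group (bank) means from both variables."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        s = sums[g]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sx, sy, n = sums[g]
        dx, dy = xi - sx / n, yi - sy / n
        num += dx * dy
        den += dx * dx
    return num / den

# Hypothetical data: y = bank effect + 0.5 * x, with effects 1, 5, and 9.
banks = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
effect = {"A": 1.0, "B": 5.0, "C": 9.0}
y = [effect[b] + 0.5 * xi for b, xi in zip(banks, x)]

slope_within = within_slope(banks, x, y)             # bank effects removed
slope_pooled = within_slope(["all"] * len(x), x, y)  # one common intercept
print(slope_within, slope_pooled)
```

In this constructed example the pooled slope is far from the true value of 0.5 because the omitted bank effects rise with x, while the within slope recovers it.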


IV. Estimation of Financial Firm Technology 

A. Parameter Estimates 

The parameter estimates of the variable profit function both with and 
without bank effects are presented in table 2. A likelihood ratio test 
indicates that the presence of bank effects is not rejected, so subse¬ 
quent testing is applied with these included. 

The financial technology must satisfy the regularity conditions of 
monotonicity and convexity. Monotonicity requires variable profit to 
be increasing in output prices and decreasing in input prices. Given 
that $v_i > 0$ by definition and $\pi > 0$, sign($\partial \pi/\partial v_i$) = sign($e_i$), $i = 1, \ldots,$
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Parameter 

                    Bank Effects Included     Bank Effects Excluded

$\alpha_1$             1.3098  (.0930)           1.2539  (.0279)
$\alpha_2$             -.0697  (.0157)           -.0807  (.0053)
$\alpha_3$              .5147  (.0773)            .6012  (.0236)
$\alpha_4$             -.2146  (.0406)           -.1763  (.0118)
$\alpha_5$             -.4331  (.0904)           -.4935  (.0259)
$\beta_{11}$            .2701  (.0730)            .1639  (.0827)
$\beta_{12}$            .0219  (.0148)           -.0222  (.0207)
$\beta_{13}$           -.3302  (.0633)           -.2305  (.0758)
$\beta_{14}$            .0774  (.0307)            .1097  (.0326)
$\beta_{15}$           -.0446  (.0654)           -.0412  (.0673)
$\beta_{1K}$           -.0146  (.0601)           -.0675  (.0202)
$\beta_{22}$            .0320  (.0404)           -.0250  (.0517)
$\beta_{23}$           -.0420  (.0393)            .0382  (.0485)
$\beta_{24}$           -.0032  (.0074)            .0088  (.0092)
$\beta_{25}$            .0129  (.0130)            .0121  (.0176)
$\beta_{2K}$            .0040  (.0104)           -.0141  (.0044)
$\beta_{33}$            .4641  (.0913)            .3325  (.1240)
$\beta_{34}$           -.0178  (.0271)           -.0433  (.0299)
$\beta_{35}$           -.0404  (.0610)           -.0384  (.0709)
$\beta_{3K}$           -.0539  (.0502)            .0257  (.0179)
$\beta_{44}$           -.1047  (.0144)           -.1205  (.0146)
$\beta_{45}$            .0390  (.0284)            .0423  (.0283)
$\beta_{4K}$           -.0269  (.0263)            .0070  (.0085)
$\beta_{55}$            .0244  (.0661)            .0154  (.0673)
$\beta_{5K}$            .0928  (.0584)            .0457  (.0187)

Note: Asymptotic standard errors in parentheses. $\alpha$ elements are first-order and $\beta$ elements second-order
parameters, and symmetric. Subscripts are 1 loans, 2 cash, 3 demand deposits, 4 time deposits, 5 labor, K capital.
Given linear homogeneity, prices of variable goods are measured relative to that for materials, good 6.


5, where $e_i = v_i x_i/\pi$. The constraint $\sum_{i=1}^{6} e_i = 1$ is used to calculate the
relative expenditure for materials, and this is negative. Part A of table
3 reports these relative expenditures. Variable profit increases when
the prices of loans and demand deposits increase, and decreases when
the prices of cash, time deposits, labor, and materials increase, for all
sample points.

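For convexity, the Hessian matrix of the variable profit function
must be positive semidefinite. For $i, j = 1, \ldots, 6$ this matrix has the
typical element

$$h_{ij} = \frac{\partial^2 \pi}{\partial v_i \, \partial v_j} = \left( e_i e_j + v_j \frac{\partial e_i}{\partial v_j} - e_i \frac{\partial v_i}{\partial v_j} \right) \frac{\pi}{v_i v_j}. \tag{11}$$

For the translog, $\partial e_i/\partial v_j = \beta_{ij}/v_j$, so

$$H_{ij} = e_i e_j + \beta_{ij} - m e_i, \tag{12}$$

where $m = 1$ if $i = j$, and 0 otherwise.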
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TABLE 3 

A. Monotonicity Test, Outputs, and Inputs

Relative Expenditure                 Standard
$v_i x_i/\pi$              Mean      Deviation    Minimum    Maximum

Loans                   1.3098       .3172        .8881     1.7901
Cash                    -.0697       .0374       -.1904     -.0322
Demand deposits          .5147       .1602        .2722     1.0962
Time deposits           -.2146       .1013       -.4113     -.0068
Labor                   -.4331       .1428       -.9120     -.2212
Materials               -.1071       .0365       -.1941     -.0422

B. Convexity Test: Hessian Matrix of Variable Profit Function
($H_{ij}$) (at Geometric Sample Mean)

                                          Demand      Time
                      Loans     Cash     Deposits   Deposits    Labor

Loans                 .7274     .6039     .3661     -.1782     -.6087
Cash                  .6039     .0926    -.0886      .0232      .0409
Demand deposits       .3661    -.0886     .1909     -.2230     -.2333
Time deposits        -.1782     .0232    -.2230      .1774      .0974
Labor                -.6087     .0409    -.2333      .0974      .6453


The $H_{ij}$ at the geometric sample mean are in part B of table 3. The
Hessian matrix is singular for the 6 x 6 case, given linear
homogeneity of the variable profit function. Accordingly, the row
and column corresponding to materials are deleted. Along the principal
diagonal, all elements are positive. The principal minors are calculated,
starting from the first row and column. The variable profit
function is convex for all observations.
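The convexity check described above can be sketched in a few lines of Python. The code builds the matrix with typical element $e_i e_j + \beta_{ij} - m e_i$, confirms that its rows sum to zero under linear homogeneity (so the full matrix is singular), and then computes leading principal minors after deleting the last good. The three-good shares and second-order parameters are hypothetical values chosen to satisfy the homogeneity restrictions; they are not the estimates in table 2, and the helper names are assumptions of this sketch.

```python
def normalized_hessian(e, beta):
    """H_ij = e_i e_j + beta_ij - m e_i, with m = 1 if i == j and 0 otherwise."""
    n = len(e)
    return [[e[i] * e[j] + beta[i][j] - (e[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def leading_minors(m):
    """Leading principal minors, starting from the first row and column."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

# Hypothetical shares (sum to one) and symmetric betas (rows sum to zero).
e = [1.5, -0.3, -0.2]
beta = [[0.10, -0.06, -0.04],
        [-0.06, 0.05, 0.01],
        [-0.04, 0.01, 0.03]]

H = normalized_hessian(e, beta)
row_sums = [sum(row) for row in H]
full_minors = leading_minors(H)                       # last one is the full determinant
reduced_minors = leading_minors([row[:2] for row in H[:2]])  # last good deleted
print(row_sums, full_minors, reduced_minors)
```

Strictly positive leading minors of the reduced matrix establish that it is positive definite; together with the zero determinant of the full matrix this is consistent with positive semidefiniteness of the full Hessian in this example.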


B. Elasticities of Transformation, Supply, and Demand

The satisfaction of regularity conditions implies that estimates of
own- and cross-price elasticities of supply and demand for monetary
and nonmonetary goods are well behaved. These elasticities measure
the responsiveness of the production of monetary goods when interest
rates and other components of user costs are altered.

The elasticity of relative quantities of monetary and other goods
when corresponding relative prices change can be derived. This elasticity
of transformation is obtainable for any pair of goods, including
both inputs and outputs. From the Hessian matrix, the elasticity of
transformation is









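$$\sigma_{ij} = \frac{\pi \, (\partial^2 \pi/\partial v_i \, \partial v_j)}{(\partial \pi/\partial v_i)(\partial \pi/\partial v_j)} = \frac{\pi h_{ij}}{x_i x_j}. \tag{13}$$

For the specification, at the geometric sample mean where $e_i = \alpha_i$,

$$\sigma_{ij} = 1 + \frac{\beta_{ij}}{\alpha_i \alpha_j} - \frac{m}{\alpha_i}, \quad i, j = 1, \ldots, 6, \tag{14}$$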


and the estimated elasticities of transformation are indicated in table
4 with asymptotic standard errors in parentheses.¹³ The on-diagonal
estimates are strictly positive. The off-diagonal terms indicate the
degree of substitutability and complementarity between inputs and
outputs. Between two inputs or two outputs, if $\sigma_{ij} > 0$ they are substitutes,
and if $\sigma_{ij} \le 0$ they are complements. For one input and one
output, if $\sigma_{ij} > 0$ they are complements, and if $\sigma_{ij} \le 0$ they are substitutes.
The last row of the table indicates relative expenditure $\alpha_i$ at the
sample mean.

The compensated elasticities of supply and demand are

$$\eta_{ij} = e_j \sigma_{ij}, \tag{15}$$

and at the geometric sample mean, for the specification,

$$\eta_{ij} = \alpha_j + \frac{\beta_{ij}}{\alpha_i} - m, \quad i, j = 1, \ldots, 6. \tag{16}$$

These estimates are in table 5. On the principal diagonal are the own
elasticities of supply and demand.
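The elasticity formulas at the sample mean can be checked numerically. The sketch below assumes the readings $\sigma_{ij} = 1 + \beta_{ij}/(\alpha_i \alpha_j) - m/\alpha_i$ and $\eta_{ij} = \alpha_j \sigma_{ij}$, with $i$ indexing the quantity and $j$ the price, and uses the bank-effects-included estimates as they appear in table 2 (loans 1.3098, demand deposits .5147, time deposits -.2146, with second-order terms .2701, .4641, and .0774). The function names are hypothetical, and the index convention is an assumption of this sketch.

```python
def transformation_elasticity(alpha_i, alpha_j, beta_ij, own):
    """sigma_ij = 1 + beta_ij / (alpha_i alpha_j) - m / alpha_i at the sample mean."""
    m = 1.0 if own else 0.0
    return 1.0 + beta_ij / (alpha_i * alpha_j) - m / alpha_i

def price_elasticity(alpha_i, alpha_j, beta_ij, own):
    """eta_ij = alpha_j * sigma_ij = alpha_j + beta_ij / alpha_i - m."""
    return alpha_j * transformation_elasticity(alpha_i, alpha_j, beta_ij, own)

alpha = {"loans": 1.3098, "demand": 0.5147, "time": -0.2146}

eta_loans = price_elasticity(alpha["loans"], alpha["loans"], 0.2701, own=True)
eta_demand = price_elasticity(alpha["demand"], alpha["demand"], 0.4641, own=True)
# Loan supply with respect to the price of time deposits (beta_14 = .0774).
eta_loans_time = price_elasticity(alpha["loans"], alpha["time"], 0.0774, own=False)
print(round(eta_loans, 4), round(eta_demand, 4), round(eta_loans_time, 4))
```

Under these assumptions the computed values agree to rounding with the own-price supply elasticities of 0.5160 and 0.4164 and the cross-price elasticity of -0.1555 quoted in the text.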

The financial technology is relatively inflexible. The own-price elas¬ 
ticity of supply for loans is 0.5160, and that for demand deposits is 
0.4164, both being significantly different from unity at the 1 percent 


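13 The elasticities of transformation, supply, and demand are functions of the $\alpha$ and
$\beta$ elements. If $\delta = (\alpha, \beta)$ and the elasticity $\theta = \lambda(\delta)$, then approximately

$$\mathrm{var}(\theta) = \sum_{n=1}^{N} \left( \frac{\partial \lambda}{\partial \delta_n} \right)^2 \mathrm{var}(\delta_n) + 2 \sum_{n=1}^{N} \sum_{l > n} \frac{\partial \lambda}{\partial \delta_n} \frac{\partial \lambda}{\partial \delta_l} \, \mathrm{cov}(\delta_n, \delta_l).$$

Here $\lambda$ is the form specified in (14), with $e_i = \alpha_i$ and N = 2 for an own elasticity and N
= 3 for a cross elasticity. The expression is obtained by taking a Taylor series expansion
and using the second-order terms. The derivatives of $\lambda$ are evaluated at the parameter
estimates. This procedure is described for N = 2 in Mood, Graybill, and Boes
(1974, chap. 5).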
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level. Time deposit demand is relatively inelastic, with a corresponding
estimate of -0.7265, also significantly less than negative
unity at the 1 percent level. While the own-price elasticity of demand
for cash is -1.5281, it is not significantly different from negative
unity at the 1 percent level.

Among physical inputs, the demand for labor is relatively elastic,
with the own-price elasticity -1.4895. For materials, the own-price
elasticity of demand is -1.4049, but this is not significantly different
from zero at the 1 percent level. Off the principal diagonal are the
cross-price elasticities of supply and demand. Each column represents
the percentage change in quantity of a given good with unit percentage
price changes for each of the six goods. In the first column, loan
supply is decreasing in the price of all inputs. All cross-price elasticities
of loan supply are less than unity, with the largest absolute
value for labor. The cross-price elasticities are less than unity in absolute
value except for three. These are the demands for labor and
materials when the price of loans changes and the demand for cash
with respect to the price of demand deposits.

The results have two implications for the monetary transmission 
process. The first is on the definition of a monetary index. The money 
supply is typically published as an index of perfectly one-for-one
substitutable monetary components and derived from a money multiplier
independent of interest rates.¹⁴ The simple sum requires infinite
elasticities of transformation and unit relative user costs. Of the six
pairs of cross-price elasticities between cash and demand and time 
deposits, only that between cash and demand deposits at 1.1167 ex¬ 
ceeds unity in absolute value. The remaining five are all significantly 
less than unity in absolute value at the 1 percent level. Monetary 
goods have low substitutability, so the simple sum form is subject to 
bias. Nonunitary relative user costs are required in the construction of 
a monetary index from the production side. Given noninfinite substi¬ 
tutability, differences can arise from the allocation of monetary ex¬ 
pansion between cash and demand and time deposits. 

The second implication for the monetary transmission process is
the effect of interest rate changes on output and employment. For the
sample banks, production is relatively insensitive to interest rates. To


14 An exception to the assumption of independence of the money multiplier from
interest rates is in Teigen (1964). Interest rates enter the money multiplier through
profit-maximizing conditions in the commercial banking sector. This provides a mechanism
for applying the empirical results to aggregate money supply determination.
Grossman and Weiss (1983) propose a theory of the monetary transmission mechanism
where an increase in the money supply can affect interest rates in the short run if
transactions at banks have indivisible characteristics. See also Sargent and Wallace
(1982).
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tiveness of monetary policy can be muted by inflexibility in the pro¬ 
duction process of monetary goods and services. 


V. Monetary and Regulatory Policy and Bank 
Behavior 

A. Introduction 

While banks are relatively responsive to prices for nonfinancial goods 
such as materials and labor, they are less so to interest rates or compo¬ 
nents of the prices of financial goods. The behavior of banks regard¬ 
ing three monetary and regulatory policy changes is investigated. The 
first is an increase in the interest rate paid on time deposits of 0.5 
percent. This permits examination of monetary policy acting through 
interest rates or increases in deposit rates payable arising through 
depository institution deregulation. 15 The second removes reserve 
requirements on demand deposits. The third eliminates the rebate 
provision of the FDIC, whereby the gross premium paid on deposits 
is reduced, but with no change in insurance coverage. The policies are 
evaluated at the 1978 sample mean. 

B. Interest Rate Effects 

The user cost of time deposits is, with P = 1 in 1978, 

$$U_4 = \frac{r_4 - R(1 - k_4) - s_4 + d_4}{1 + R}, \tag{17}$$

where $r_4$ is the interest rate paid on time deposits, $R$ the discounting
rate, $k_4$ the reserve requirement rate, $s_4$ service charge revenue, and $d_4$
the deposit insurance premium rate. Since $dU_4/dr_4 = 1/(1 + R)$, an
increase of 0.5 percent in time deposit interest rates raises user costs
and prices by 0.5/(1 + R) percent. The mean interest rate paid on
such deposits in 1978 is 5.037 percent from table 1.

From table 5, the own-price elasticity of demand for time deposits is
-0.7625. A 0.5 percentage point increase in time deposit interest
rates at R = 0.05 increases the own price by 9.4 percent and reduces
the bank demand for time deposits by 7.2 percent. Within a 95 percent
confidence interval, the own-price elasticity is bounded below by
-0.9730. Using this estimate, the reduction is 9.1 percent. Hence

15 Barro and Santomero (1972) have shown that interest rate ceilings in the regulated
period are binding, and Santomero and Siegel (1981) have examined the aggregate
effects of this.
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monetary production is relatively inelastic, but there remains some 
response to price changes. 

One rationale for interest rate restrictions is that banks pass on 
lower rates as reduced costs to borrowers. Apart from the efficiency 
and equity aspects of this transfer, it is possible to examine the degree 
to which this takes place. The cross-price elasticity of loan supply with 
respect to time deposit prices is -0.1555. A 9.4 percent increase in
such prices reduces loan quantities by 1.5 percent. Within a 95 per¬ 
cent confidence interval, the reduction is bounded above in absolute 
value by 2.3 percent, or less than one quarter of the percentage 
change in prices. 

On deposit switching, the evidence is also of sluggish response. As 
interest rates rise on time deposits, holders shift from demand to time 
deposits. This increases the average cost of deposits to the bank. The 
elasticity of demand deposit supply with respect to the price of time
deposits is -0.2491. The decline in the volume of demand deposits is
2.3 percent from a 0.5 percentage point time deposit interest rate
increase, or a 9.4 percent increase in price. A 95 percent confidence
interval contains a 1.1-3.5 percent decline in demand deposit services.
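The arithmetic of this subsection is a first-order calculation: the percentage change in quantity is the elasticity times the percentage change in price. The short Python sketch below reproduces it, taking the 9.4 percent price change and the elasticities as quoted in the text rather than deriving them; the variable and function names are illustrative only.

```python
def quantity_response(price_change_pct, elasticity):
    """First-order response: percent change in quantity = elasticity * percent change in price."""
    return elasticity * price_change_pct

R = 0.05
user_cost_change = 0.5 / (1 + R)  # percentage points, from dU_4/dr_4 = 1/(1 + R)

price_change = 9.4  # percent, as reported in the text
responses = {
    "time deposits (point estimate)": quantity_response(price_change, -0.7625),
    "time deposits (lower bound)": quantity_response(price_change, -0.9730),
    "loans": quantity_response(price_change, -0.1555),
    "demand deposits": quantity_response(price_change, -0.2491),
}
for name, value in responses.items():
    print(f"{name}: {value:.1f} percent")
```

Rounded to one decimal place, the four responses match the reductions of 7.2, 9.1, 1.5, and 2.3 percent given in the text.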


C. Reserve Requirements

Reserve requirements enter the user costs of deposits explicitly. The 
effects of these taxes have been noted by Fama (1980). The user cost 
of demand deposits is, for P = 1 in 1978, 


$$U_3 = \frac{r_3 - R(1 - k_3) - s_3 + b_3}{1 + R}. \tag{18}$$

If the reserve requirement tax is eliminated, then $Rk_3 = 0$. As an
opposite polar case, the Fisher (1935) "100 percent money" policy
implies $k_3 = 1$ and $Rk_3 = R$. The mean reserve requirement tax $Rk_3$ is
0.475 percent. The flow effect on demand deposit user costs of
changing the reserve requirement is $dU_3/dk_3 = R/(1 + R)$. If $k_3 = 0$,
the effect on $U_3$ is to eliminate the $Rk_3/(1 + R)$ term. At R = 0.05, this
amounts to a 0.452 percentage point reduction. This decreases the
user cost, or increases the price of demand deposits, to -4.52 percent
from the 1978 sample mean of -4.07 percent. The price received
for demand deposit services is 11.1 percent higher if reserve requirements
are removed.

The own-price elasticity of supply for demand deposits is 0.4164. A 
95 percent confidence interval is bounded above by 0.7823, indicating 
relative inelasticity in supply of this monetary good. Eliminating re¬ 
serve requirements increases the demand deposit quantity by 4.6 per- 
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cent from the 11.1 percent price increase. Banks are more willing to 
supply demand deposits if the taxes imposed by reserve requirements
are eliminated. Reserve requirements impose costs on bank operations
and reduce output and employment. This conclusion rests on
the assumption that reserve requirements are pure taxes.
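The reserve requirement calculation can be retraced step by step. The sketch below uses only the figures quoted in the text (a mean reserve tax of 0.475 percent, R = 0.05, a mean user cost of -4.07 percent, and an own-price supply elasticity of 0.4164); the variable names are illustrative.

```python
R = 0.05
reserve_tax = 0.475            # mean R * k_3, in percent
mean_user_cost = -4.07         # 1978 sample mean user cost, in percent
own_supply_elasticity = 0.4164

# Removing the tax eliminates the R * k_3 / (1 + R) term of the user cost.
user_cost_cut = reserve_tax / (1 + R)                 # percentage points
new_user_cost = mean_user_cost - user_cost_cut        # percent
price_rise_pct = 100 * user_cost_cut / abs(mean_user_cost)
quantity_rise_pct = own_supply_elasticity * price_rise_pct

print(round(user_cost_cut, 3), round(new_user_cost, 2),
      round(price_rise_pct, 1), round(quantity_rise_pct, 1))
```

The rounded results correspond to the 0.452 point reduction, the user cost of -4.52 percent, the 11.1 percent price increase, and the 4.6 percent rise in demand deposit quantity reported above.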

Conclusions for policy changes depend on the weights assigned to 
employment and bank efficiency and on the benefits of the regula¬ 
tions. The interest rate increase on time deposits and removal of 
reserve requirements on demand deposits each change user costs for 
the respective monetary goods by approximately 0.5 percentage 
points. The former has a larger own effect on variable profit, and 
ultimately on output and employment, since time deposits are more 
flexibly used in production than demand deposits. 

D. Deposit Insurance 

Quantifiable costs of deposit insurance are readily available only for
the mandatory premium paid to the FDIC. Indirect regulation in
solvency supervision, asset and liability coverage, and inspection may
also obtain (see Buser, Chen, and Kane 1981). The regulations during
1973-78 provide that a premium of 1/12 of 1 percent of deposits be
levied. This is rebated to an effective premium of 1/30 of 1 percent. If
no rebate is made, the premium increases by a multiplier of 30/12, or
2.5. The effect of eliminating this rebate is examined.

For demand deposits, the average FDIC premium rate is 0.0392
percent in 1978. Applying this 2.5 multiplier factor on nonrebated
premia, the rate becomes 0.098 percent. Using the sample mean
values of 1978 in table 1, in percentage terms, $r_3 = 0.0006$, $b_3 =
0.0392$, $Rk_3 = 0.4750$, and $s_3 = 0.7167$. Now $dU_3/db_3 = 1/(1 + R)$, and
the change in FDIC premium is 0.059 percentage points. Further,
0.059/(1 + R), the change in the user cost of demand deposits, is
0.056 percent at R = 0.05. From the mean demand deposit user cost
of -4.07 percent, the FDIC rebate abolition changes the net cost to
-4.01, or a 1.5 percent reduction.

The elimination of the premium rebate increases demand deposits 
by 0.6 percent, bounded above in a 95 percent confidence interval by 
1.2 percent, from the 1.5 percent price reduction. These are the 
direct estimates. If the perception is that higher premium rates indi¬ 
cate safer banks, deposits may increase further. 
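The deposit insurance figures follow from a few lines of arithmetic, sketched below with the numbers quoted in the text. The 1.5 percent figure is reproduced here from the rounded user costs of -4.07 and -4.01, which is an assumption about how the rounding was done; the variable names are illustrative.

```python
R = 0.05
rebated_rate = 0.0392           # average effective FDIC premium, percent
multiplier = 30 / 12            # statutory 1/12 percent over rebated 1/30 percent
mean_user_cost = -4.07          # percent

unrebated_rate = rebated_rate * multiplier
premium_change = unrebated_rate - rebated_rate        # percentage points
user_cost_change = premium_change / (1 + R)           # since dU_3/db_3 = 1/(1 + R)
new_user_cost = round(mean_user_cost + user_cost_change, 2)
price_reduction_pct = 100 * (abs(mean_user_cost) - abs(new_user_cost)) / abs(mean_user_cost)

print(multiplier, round(unrebated_rate, 3), round(premium_change, 3),
      round(user_cost_change, 3), new_user_cost, round(price_reduction_pct, 1))
```

With these inputs the premium rises to 0.098 percent, the user cost changes by 0.056 points to -4.01 percent, and the implied price change is about 1.5 percent.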


VI. Concluding Remarks 

Underlying the aggregate demand for monetary goods is the behav¬ 
ior of individual firms and households. The degree of flexibility in the 
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demand for money indicates the extent to which monetary policy is 
effective. Fundamental to the aggregate supply of money is the production
behavior of financial firms. The flexibility of financial production
at the individual firm level determines how loans and monetary
quantities respond to interest rates and other regulatory policy
variables.

An estimable model of the microeconomics of monetary produc¬ 
tion has been derived. The estimates satisfy the properties of convex¬ 
ity and monotonicity, required for monetary and nonmonetary sup¬ 
ply and demand elasticities to be well behaved. A full transformation 
matrix between outputs and inputs is obtained from the Hessian ma¬ 
trix of the variable profit function. The duality between transforma¬ 
tion and variable profit functions is applied to monetary and financial 
production. Among the financial goods, supplies of outputs and de¬ 
mands for inputs are relatively inelastic, and there is little substitut¬ 
ability between them. There is relatively flexible usage of physical 
goods, labor, and materials. 

On monetary policy, the short-run response in deposit switching or 
in loan supply from increases in interest rates is small. The taxes 
arising from reserve requirements are calculated as a flow per period. 
The reduction or elimination of these taxes does not have large ef¬ 
fects because of the low own-price elasticities of demand and supply 
in absolute value for monetary goods. 

While elasticities are low in absolute value, the benefits from enact¬ 
ing changes can be large, depending on policy weights. What is more 
relevant is that financial firms are not the only producers using cash, 
demand deposits, and loans in production. The use of these services is 
virtually universal. 

In the construction of prices for financial services, the first mo¬ 
ments of the distributions of interest rates have been used. A more 
general specification could involve higher-order moments of these 
distributions. The financial firm maximizes a function based on the 
utility of variable profits, and the response of monetary production 
depends on the attitude toward risk and the specification of the utility 
function. 

If monetary goods are to be excluded in estimating production 
technologies, the necessary separability conditions from physical 
goods and nonmonetary financial goods must be tested. If these con¬ 
ditions fail, monetary goods should be included as either inputs or 
outputs in a more general transformation function. A complete anal¬ 
ysis of monetary theory and policy requires application of these mi¬ 
croeconomic models to other financial firms and, ultimately, if aggre¬ 
gation conditions obtain, to data on the economy as a whole. 
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Nutrition and the Demand for Tastes 


Eugene Silberberg 

University of Washington 


The law of diminishing marginal product, applied to food nutrients,
implies that as consumers’ incomes increase, a smaller fraction of
their food budget will be devoted to pure nutrition. I test this result
using data from the Nationwide Food Consumption Survey, 1977-
78. The foods consumed by five income groups are observed, and
the amounts of 18 nutrients actually consumed are calculated. With
those observed nutrient levels as constraints, the linear programming
diet problem calculates the diet providing those nutrient levels
at least cost. The ratio of actual cost to least-cost expenditure on
nutrition is shown to increase with income.


I. Introduction 

In 1945 George Stigler formulated and gave an approximate solution
to the now famous “diet problem,” which seeks the minimum cost of
achieving the recommended daily allowances of nutrients known to
be beneficial to humans. That diet, not surprisingly, comprised foods
most people find unappetizing: pork liver, spinach, cabbage, dried
beans, evaporated milk, and wheat flour. Stigler’s menu pointedly
demonstrated the difference between “technical efficiency” and economic
efficiency, where preferences count. In this paper I develop a
hypothesis about the exercise of tastes by humans with regard to food
consumption, using the law of diminishing marginal product. This


I am indebted to Yoram Barzel, Thomas Borcherding, and Keith Leffler for helpful
comments and to Mary Hama of the Food Consumption Research Group (USDA) for
providing additional information on the data. The paper would not have been possible
without John R. Nelson’s diligent research and data processing and Kirk Glerum's
assistance in programming. I am indebted to the Sloan Foundation for a summer grant
in support of this research. All remaining errors are of course my own.

[Journal of Political Economy, 1985, vol. 93, no. 5]

© 1985 by The University of Chicago. All rights reserved. 0022-3808/85/9305-0007$01.50
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theory is then tested with data from the Nationwide Food Consump¬ 
tion Survey, 1977-78, published by the U.S. Department of Agricul¬ 
ture (USDA). Specifically, I show that economic postulates predict 
and the data confirm that as income rises, expenditure on pure “nu¬ 
trition” falls as a percentage of overall expenditure on food. 

II. The Model 

The choice of foods consumed has in general been unexplained by
economists; these decisions have been relegated to the domain of
“tastes,” to make an inevitable pun. If foods are treated as final goods
in consumption, then convexity of indifference curves provides no
prediction regarding inferiority or normality of foods. Following
Becker (1965) and Lancaster (1966), I treat foods purchased in the
market as inputs in the production of meals, which provide both
pleasing taste and nutrition. Various vitamins and minerals are by
now well known to be useful to humans in maintaining health. As
Stigler noted, however, these nutrients, like all productive inputs, are
subject to the law of diminishing returns. Advocates of megadoses
notwithstanding, the prevailing scientific opinion seems to be that
these nutrients are highly useful to humans in some (generally small)
amounts, that larger amounts provide marginally less in the way of
additional benefits, and that in some cases, for example, nonsoluble
vitamins, very large intakes are even toxic. Thus we can assume that
nutrients are useful to humans in the classical Knightian sense, with
three stages of production.

Since the marginal products of nutrient elements fall as consumption
of these inputs increases, we should expect a corresponding fall
in consumers’ marginal values of vitamins and minerals. More specifically,
as nutrient intakes increase, the marginal values of tastiness,
aroma, texture, and so forth, that is, the qualities of goods relating to
palatability, should rise relative to the marginal values of inputs to
pure nutrition. We should therefore expect that, as income rises,
consumers will spend a decreasing fraction of their food budget on
pure nourishment. As income increases, consumers will defer relatively
more toward the pleasurable aspects of eating and relatively less
toward the production of nourishment.

III. Empirical Tests 

A. General Procedure 

The theory outlined above is tested as follows. Letting $x_j$ = amount of
food $j$ actually purchased by individuals in a given income group,¹ the


1 An additional subscript for income class is omitted to avoid notational clutter.
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total amount $b_i$ of nutrient $i$ consumed by individuals in that income
class is $b_i = \sum_j a_{ij} x_j$, where $a_{ij}$ = the amount of nutrient $i$ in one unit of
food $j$. The combination of foods providing consumers with at least
those levels of the nutrients and at least cost is obtained by solving the
linear programming diet problem (LPDP),

minimize $y = p'x$

subject to $Ax \ge b$,
$x \ge 0$,

where the column vector $b$ consists of the nutrient levels $b_i$ actually
achieved by these consumers. Let $x^*$ denote the food vector that solves
the LPDP and let $y^* = p'x^*$, the resulting minimum possible expenditure
on those nutrients. If $y$ = actual expenditure on these foods, the
model predicts that the ratio $y/y^*$ will increase with increasing income.
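The procedure can be illustrated on a toy problem with two foods and two nutrients, small enough to solve by enumerating the corner points of the feasible region with the standard library alone. The prices, nutrient coefficients, and observed diet below are hypothetical, and `solve_diet_2foods` is a name invented for this sketch; a problem of the size used in the paper (271 foods, 18 nutrients) would call for a proper linear programming routine instead.

```python
from itertools import combinations

def solve_diet_2foods(p, A, b):
    """Least-cost diet for exactly two foods by vertex enumeration.

    Solves: minimize p'x subject to A x >= b, x >= 0.
    Returns (minimum cost, (x1, x2)), or None if no feasible vertex exists.
    """
    # Each constraint is (a1, a2, rhs), meaning a1*x1 + a2*x2 >= rhs.
    cons = [(row[0], row[1], rhs) for row, rhs in zip(A, b)]
    cons += [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # nonnegativity of each food
    best = None
    for (a1, a2, r1), (c1, c2, r2) in combinations(cons, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection of the two boundary lines.
        x1 = (r1 * c2 - a2 * r2) / det
        x2 = (a1 * r2 - c1 * r1) / det
        if all(k1 * x1 + k2 * x2 >= rhs - 1e-9 for k1, k2, rhs in cons):
            cost = p[0] * x1 + p[1] * x2
            if best is None or cost < best[0]:
                best = (cost, (x1, x2))
    return best

# Hypothetical data: prices per pound and nutrient content per pound.
p = [1.0, 5.0]
A = [[10.0, 40.0],   # nutrient 1 per pound of food 1 and food 2
     [30.0, 10.0]]   # nutrient 2 per pound of food 1 and food 2
x_actual = [3.0, 2.0]  # observed diet, pounds per week

# Nutrient levels actually achieved, b = A x, and actual expenditure, y = p'x.
b = [sum(a * x for a, x in zip(row, x_actual)) for row in A]
y_actual = sum(pi * xi for pi, xi in zip(p, x_actual))

y_star, x_star = solve_diet_2foods(p, A, b)
ratio = y_actual / y_star
print(y_actual, y_star, x_star, ratio)
```

In this example the observed diet attains both nutrient levels exactly but includes the expensive second food, so the least-cost diet drops it and the ratio of actual to minimum cost exceeds one.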

B. Data 

The USDA’s Consumer Nutrition Center surveyed food consumption
in approximately 15,000 households in 1977 and 1978. Most
important for this study, the data were broken into five income
groups, ranging from under $5,000 to over $20,000 per household.
Altogether, several hundred separate foods appear to have been
tabulated, though many small categories are ultimately aggregated.
For example, wheat flour includes white, whole wheat, buckwheat,
cake meal, gluten, and low-protein flour. Close to three hundred
separate food categories were ultimately reported, listing data on
physical quantities (pounds per week) and total expenditure per week
for each of five income groups plus a total for all households combined.
In the analysis presented below, a total of 271 separate foods
were considered. Eliminated were foods with such small entries that
rounding error appeared significant.² A complete list of the foods
analyzed is given in the Appendix.

Five household income categories are delineated: under $5,000,
$5,000-$9,999, $10,000-$14,999, $15,000-$19,999, and $20,000
and over. The actual mean incomes for these groups are $3,032,
$7,223, $12,023, $16,934, and $29,290, respectively.³ These incomes,
however, do not include in-kind transfers such as food stamps, welfare,
social security and medicare payments, and the like. The lowest
income group, in particular, has actual income above these reported


2 Examples were cheese dips, in which certain entries were missing, unsalted nuts
other than peanuts, and like categories. Also eliminated by aggregation were, e.g.,
subcategories of fruit ade, drink punch, and nectar, e.g., powdered, frozen, etc. The
aggregated quantities and expenditures were deemed more reliable in these instances.
3 I am indebted to Mary Hama for these figures.
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figures. Unfortunately, the households differ in terms of size. House¬ 
hold size was defined to be the number of meals served to all persons 
in the same week, divided by 21. Thus meals served to guests and 
boarders were included. Arbitrary weights were given to snacks; for 
example, coffee and doughnuts served to a neighbor were counted as 
one-fourth of a meal. Not surprisingly, household size generally in¬ 
creased with income in the sample. The average household sizes for 
the five income groups listed above are, respectively, 1.91, 2.53, 2.91, 
3.22, and 3.21. Happily, the last two income categories have essen¬ 
tially the same size and therefore permit more direct comparisons 
with each other than with the other households. Using these incomes
and household sizes, the per capita incomes, in 1977 dollars, of these
individuals are $1,587+, $2,855, $4,132, $5,259, and $9,125, respectively.
The possible effects of differential family size on the empirical
results will be discussed later.
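The per capita figures follow directly from dividing each group's mean household income by its average household size, as the following short check shows. The numbers are those quoted in the text; the variable names are illustrative.

```python
mean_income = [3032, 7223, 12023, 16934, 29290]   # dollars per household
household_size = [1.91, 2.53, 2.91, 3.22, 3.21]   # persons (meal equivalents)

# Per capita income, rounded to the nearest dollar.
per_capita = [round(income / size) for income, size in zip(mean_income, household_size)]
print(per_capita)
```

The rounded quotients agree with the per capita incomes reported above for all five groups.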

Prices for each food were derived by dividing total expenditure on
each food by the quantity (pounds) consumed. A comparison was
made of the prices derived for each income group. In general there
were insignificant differences, many of them due to rounding error, in
the prices paid by each income group. The only possible exceptions
were that the highest income group sometimes paid slightly higher
prices than the other groups. It is not possible to say how many of
these differences, if any, were due to the possibility that higher income
groups tend to purchase higher-quality food or to make purchases
in supermarkets providing better service. The differences appeared
quite small, however, and thus food prices were defined by the
ratio of all households’ expenditures to the quantity consumed by all
households for each good analyzed.

The Food Consumption Survey ultimately provided a 271 x 5
matrix of foods actually consumed by American households in 1977-78,
one column for each income class. The columns of this matrix
were divided by household size, producing a 271 x 5 matrix, denoted
x, representing the actual foods consumed per person for each income
group. Also calculated from this data source was the 1 x 271
matrix (vector) of observed food prices, p'. The actual total weekly
expenditures on food per person, p'x = y, are y = ($14.859, $14.843,
$15.007, $15.613, $17.215). Even though household income rises,
food expenditure per person actually declines from the first to the second
group (though this does not appear significant) and rises only slightly
between the second and third income groups. However, higher
household income was clearly accompanied by higher food expenditures for
the two highest groups, both of which had higher per capita incomes
than the second and third groups.
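The product p'x can be illustrated with a minimal sketch. The three foods, two income groups, and all numbers below are hypothetical and are not taken from the survey; the function name `expenditure_per_person` is likewise an illustrative choice.

```python
def expenditure_per_person(prices, quantities):
    """Return y = p'x: one weekly food bill per income group.

    prices      -- price per pound, one entry per food
    quantities  -- one row per food; each row holds pounds consumed
                   per person per week by each income group
    """
    n_groups = len(quantities[0])
    return [
        sum(p * row[g] for p, row in zip(prices, quantities))
        for g in range(n_groups)
    ]


# Hypothetical data: 3 foods, 2 income groups (not survey values).
prices = [0.50, 2.00, 1.00]
quantities = [
    [4.0, 3.0],  # food 1
    [1.0, 2.0],  # food 2
    [2.0, 2.5],  # food 3
]

y = expenditure_per_person(prices, quantities)
print(y)
```

With the actual data, `prices` would hold 271 entries and `quantities` would be the 271 x 5 matrix x.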

The next step in the procedure was the construction of the A matrix
of nutrient coefficients, using the nutrients commonly listed in
tables of Recommended Daily Allowances (RDAs). The complete list
of nutrients is as follows: (1) protein, (2) vitamin A, (3) vitamin D, (4)
vitamin C, (5) folacin, (6) niacin, (7) riboflavin (vitamin B2), (8)
thiamine (vitamin B1), (9) vitamin B6, (10) vitamin B12, (11) calcium,
(12) phosphorus, (13) iron, (14) magnesium, (15) zinc, (16) food energy,
(17) lipids (total), and (18) carbohydrates. 4 Lipids (fats) and
carbohydrates are not listed in tables of RDAs but are important
nutritional elements analyzed by the USDA and other agencies. 5

In all, an 18 x 271 matrix A of nutrient coefficients was con¬ 
structed. Most of the nutritional information came from publications 
by the USDA; the data on various minerals were obtained from spe¬ 
cific articles on those nutrients (see Hardinge and Crooks 1961; Orr 
1969; Hankin, Margen, and Goldsmith 1970; Murphy, Watt, and 
Rizek 1973; Perloff and Butrum 1977; Greger, Marhefka, and Geissler
1978). The principal source was Agricultural Handbook no. 8,
Composition of Foods (1981), a series of detailed listings of nutrients of 
various foods. This series is at present incomplete; when data were 
missing, the older handbook, published in 1963, was used. 


C. Empirical Results 

The per capita levels of the 18 nutrients actually consumed by the five
household groups are displayed in table 1. Also displayed for comparison
are the RDAs of these nutrients, as published by the U.S.
Food and Nutrition Board of the National Academy of Sciences
(1980). 6

Certain features of these nutrient levels should be noted. For most
nutrients, the exceptions being folacin and magnesium, the RDAs
were easily exceeded by each income group. Indeed, the RDAs of
vitamins A and D were exceeded by factors of 10 or more. Consumption
of protein, vitamin C, niacin, and phosphorus was approximately


4 Iodine was commonly listed in tables of RDAs, but this nutrient is not contained in
the foods studied. It is obtained in sufficient quantities from the use of iodized salt and
was therefore left out of this analysis. Vitamin E was not included because its composition
in foods has not yet been analyzed. Vitamin K is produced by the body itself, and
deficiency is apparently extremely rare.

5 Lipids include all animal fats and vegetable oils. Aside from providing concentrated 
amounts of food energy, certain lipids are essential parts of the structure of cell mem¬ 
branes. They also provide solvents for certain vitamins, e.g., A and D, which are 
insoluble in water. Carbohydrates form the other major food group, in addition to 
proteins. Carbohydrates comprise sugars and starches and other chemically similar 
nutrients and are used by the body for energy.

6 Note that these are recommended, not minimum daily requirements. The RDAs vary 
by sex and age; the ones presented here are for males, aged 23-50, weighing 154 
pounds and 70 inches in height. There are no RDAs for carbohydrates and lipids.
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TABLE 1 

Nutrient Levels Achieved by Income Groups 1-5, Based on Actual
Expenditures and RDAs

Nutrient                  1         2         3         4         5       RDA

Protein (g)             7.274     7.040     7.001     7.011     7.568     3.920
Vitamin A (IU)          6.167     5.862     5.26      4.897     5.517      .700
Vitamin D (µg)          8.957     8.751     9.528     9.775    11.257      .350
Vitamin C (mg)          8.863     8.810     8.369     8.490     9.098     4.200
Folacin (µg)           21.723    20.892    20.116    18.500    21.155    28.000
Niacin (mg)            20.504    20.148    19.917    19.931    21.778    12.600
Riboflavin (mg)        16.697    16.187    16.421    16.041    17.611    11.200
Thiamine (mg)          13.677    13.009    12.834    12.034    12.757     9.800
Vitamin B6 (mg)        17.870    17.393    16.567    16.252    17.721    15.400
Vitamin B12 (µg)       22.926    22.545    23.002    23.460    26.262    21.000
Calcium (mg)            6.571     6.230     6.532     6.526     7.418     5.600
Phosphorus (mg)        11.837    11.444    11.486    11.418    12.394     5.600
Iron (mg)              13.836    13.543    13.633    13.073    13.945     7.000
Magnesium (mg)         21.934    21.955    20.670    19.970    21.918    24.500
Zinc (mg)               5.293     4.714     4.773     4.834     4.587     1.050
Food energy (kcal)     19.425    19.108    19.241    19.222    20.153    18.900
Lipids (g)              9.926     9.883     9.838    10.020    10.442     0
Carbohydrates (g)      19.912    19.737    20.193    19.178    20.295     0


double their RDAs; consumption of zinc, four to five times the RDA. 
The remaining nutrients were in general consumed at levels 10-50
percent greater than their respective RDAs. What these data reveal is 
that even consumers in the lowest income group were, on average, 
well beyond the danger of undernourishment in any nutrient, al¬ 
though this obviously does not rule out malnourishment for given 
individuals. Inspection of table 1 reveals that at the intermediate 
levels of income, there is no apparent pattern for “normality” or 
inferiority of these nutrients, though all, with the exception of zinc, 
appear normal at the higher income levels. It therefore seems safe to 
conclude that, on average, these consumers were well into stage 2 of 
the production function for nutrition (indeed very close to the bound¬ 
ary of stage 3) and that the marginal products of these nutrients in 
terms of improvement of health would all be quite small. We would 
therefore expect the main response to an increase in income to be a 
change in the mix of foods purchased in the market so as to produce 
nutrition in a more enjoyable manner. 

A puzzling feature of the data is that the consumption of nutrients
by the lowest income group is higher, for every nutrient, than the 
level consumed by the next higher income group. The reason for this 
anomaly may be that this low-income group consists disproportion¬ 
ately of adults, in particular, elderly people. 7 Older people may place 


7 I am indebted to Tom Borcherding for suggesting this explanation and to Mary
Hama for confirming it.
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higher marginal values on nutrition than younger people, for given 
incomes. The ability of the human body to overcome poor nutrition is 
greater in young people, and with positive real interest rates, the 
value of prolonged life would be greater for older rather than young¬ 
er people. 8 As a result of this feature of the data, the linear program¬ 
ming solution yields a lower expenditure for group 2 than group 1. 
The actual expenditure on food for these two groups is essentially 
identical—less than 2 cents out of nearly $15 per week, a difference 
possibly attributable to rounding error. 

The linear programming (LP) solutions to the diet problems, where
b_i = the level of nutrient i actually achieved by each income group,
are presented in part A of table 2. 9 Also displayed is the solution to
the traditional diet problem, where the b_i's are the RDAs of each
nutrient. 10 Summary statistics are presented below the commodity
levels: the actual per capita expenditure on food, y, the LP minimum,
y*, and the ratio y/y* for each income group. Part B of table 2 presents
the results when nutrients 17 and 18, lipids and carbohydrates, for 
which no RDAs are specified, are deleted from the LP constraints. As 
an additional test of the robustness of the theory, the LP problem was 
solved using as binding constraints only those for the first four nutri¬ 
ents, protein and vitamins A, D, and C. These results are displayed in 
part C of table 2. The choice of these particular nutrients was arbi¬ 
trary; the purpose in doing this experiment was only to see if the 
results held when only a small set of nutrients was used instead of a 
larger one. 
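The structure of the diet problem (minimize cost subject to nutrient floors) can be sketched on a toy scale. The example below is not the program used in the paper: it assumes only two hypothetical foods and two hypothetical nutrients, with strictly positive prices, and finds the least-cost diet by checking every corner of the feasible region. All coefficients are invented for illustration.

```python
from itertools import combinations


def solve_diet_2foods(prices, A, b, tol=1e-9):
    """Least-cost diet with two foods, found by enumerating vertices.

    Minimize   prices[0]*x1 + prices[1]*x2
    subject to A[i][0]*x1 + A[i][1]*x2 >= b[i] for every nutrient i,
               x1 >= 0, x2 >= 0.
    Assumes positive prices, so the minimum sits at a vertex.
    Returns (cost, x1, x2), or None if no diet is feasible.
    """
    # Boundary lines c1*x1 + c2*x2 = d: nutrient rows plus the two axes.
    lines = [(row[0], row[1], rhs) for row, rhs in zip(A, b)]
    lines += [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

    best = None
    for (a1, a2, d1), (c1, c2, d2) in combinations(lines, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection point.
        x1 = (d1 * c2 - a2 * d2) / det
        x2 = (a1 * d2 - d1 * c1) / det
        feasible = (
            x1 >= -tol
            and x2 >= -tol
            and all(r[0] * x1 + r[1] * x2 >= rhs - tol
                    for r, rhs in zip(A, b))
        )
        if feasible:
            cost = prices[0] * x1 + prices[1] * x2
            if best is None or cost < best[0]:
                best = (cost, x1, x2)
    return best


# Hypothetical coefficients: rows are nutrients, columns are foods.
prices = [1.0, 2.0]
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [8.0, 6.0]

cost, x1, x2 = solve_diet_2foods(prices, A, b)
print(round(cost, 3), round(x1, 3), round(x2, 3))
```

Vertex enumeration is practical only for tiny problems; a problem with 271 foods and 18 nutrients calls for a simplex or interior-point solver.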

The results in these tables confirm the theory outlined above. In 
every case, as income increases, the ratio y/y* increases, indicating that 
the actual pattern of food consumption deviates further from the 
technically efficient consumption of nutrients. The effect is most pro¬ 
nounced in the comparison of income groups 4 and 5, for which 
household sizes were the same. 
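As a quick arithmetic check, the ratios y/y* in part A of table 2 can be recomputed from the reported actual expenditures and LP minima. The sketch below uses only the figures printed in that table; the variable names are illustrative.

```python
# Actual weekly food expenditure per person, income groups 1-5 (table 2).
y_actual = [14.859, 14.843, 15.007, 15.613, 17.215]

# LP minimum cost y* for the same groups (table 2, part A).
y_star = [4.979, 4.846, 4.850, 4.898, 5.257]

ratios = [round(y / ys, 3) for y, ys in zip(y_actual, y_star)]
print(ratios)  # [2.984, 3.063, 3.094, 3.188, 3.275]

# The claim in the text: the ratio rises with every step up in income.
rises_with_income = all(lo < hi for lo, hi in zip(ratios, ratios[1:]))
print(rises_with_income)  # True
```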

It is reasonable to conclude that the results would be stronger if 
household size were constant for each income group. It seems plausi¬ 
ble that there are economies of scale in food preparation. Purchasing 
food for two persons is likely to result in less waste than preparing 
food for one person, and so forth as household size increases. Additionally,
purchases made in small quantities are frequently higher
priced (per unit) than purchases in larger amounts. In these calculations,
price was taken as the same for all households. If any bias is


8 However, nutrition may also be preventive maintenance, whose value is larger the
longer the horizon.

9 These and all other solutions to the LP problems utilized the LINDO program written
by Linus Schrage for the VAX computer.

10 The foods denoted "wheat cereal" and "oat cereal" are in fact market composites
of cold cereals best known by the brand names Wheaties and Cheerios, respectively.



TABLE 2 

Food Consumption per Week (Pounds)

Food                          RDA        1        2        3        4        5

A. Various Income Classes

Low-fat milk                 .199    5.084    4.967    5.407    5.548    6.389
Vegetable oil                 ...    1.813    1.819    1.807    1.856    1.903
Wheat flour                11.418    9.494    9.205    9.029    9.053    9.725
Wheat cereal                 .143     .587     .525     .628     .383     .412
Oat cereal                   .704     .054     .102      ...     .230     .260
Pork chops                    ...     .252     .218     .223     .225     .204
Mustard greens               .792    2.294    2.297    2.133    2.219    2.361
Baking powder                .095     .011     .004     .009     .008     .015
Computed y* ($)             3.145    4.979    4.846    4.850    4.898    5.257
Computed y ($/individual)     ...   14.859   14.843   15.007   15.613   17.215
y/y*                          ...    2.984    3.063    3.094    3.188    3.275
y*/3.145                      ...    1.583    1.541    1.542    1.557    1.757

B. Nutrients 1-16 (Excludes Lipids, Carbohydrates)

Low-fat milk                 .199    5.084    4.967    5.407    5.548    6.389
Vegetable oil                 ...      ...      ...      ...      ...      ...
Wheat flour                11.418   11.494   10.205   10.924   10.053   11.305
Wheat cereal                 .143     .303     .216     .235     .047     .121
Oat cereal                   .704     .311     .383     .354     .534     .524
Pork chops                    ...     .241     .206     .209     .212     .193
Mustard greens               .792    2.360    2.367    2.222    2.298    2.429
Baking powder                .095    .0004      ...     .001      ...      .08
Computed y* ($)             3.145    4.277    4.167    4.206    4.230    4.515
Computed y ($/individual)     ...   14.859   14.843   15.007   15.613   17.215
y/y*                          ...    3.474    3.562    3.568    3.691    3.813

C. Nutrients 1-4 Only (Protein, Vitamins A, D, and C)

Low-fat milk                 .199    5.084    4.967    5.407    5.548    6.389
Vegetable oil                 ...      ...      ...      ...      ...      ...
Wheat flour                 6.237   10.348    9.991    9.839    9.816   10.499
Wheat cereal                  ...      ...      ...      ...      ...      ...
Oat cereal                    ...      ...      ...      ...      ...      ...
Pork chops                    ...      ...      ...      ...      ...      ...
Mustard greens              1.361    2.807    2.791    2.642    2.679    2.865
Baking powder                 ...      ...      ...      ...      ...      ...
Computed y* ($)             1.495    3.378    3.291    3.297    3.324    3.628
Computed y ($/individual)     ...   14.859   14.843   15.007   15.613   17.215
y/y*                          ...    4.399    4.510    4.552    4.697    4.745
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present, the smaller units, which are the lower-income groups in these 
data, pay more than reported. Thus on both accounts, the figures for
y (actual expenditures) are if anything too high for the low-income
groups and too low for the high-income groups. Correcting for such
bias would have the effect of increasing the rate of change of y/y* as
income increased. It is also likely that higher-income families are
more highly educated and would therefore more likely have better
information about nutrition than the lower-income families. This,
too, would tend to reduce y/y* as income increases. In fact this ratio
increases with income even in the presence of factors that might cause
the ratio to go the other way.

The LP solutions that emerged for the three problems are unusual
in that far fewer foods (approximately one-half) appeared in these 
solution matrices than would be expected. A fundamental theorem of 
linear programming states that, in an optimal solution, the number of 
nonzero variables will be no greater than the number of linearly inde¬ 
pendent constraints. Thus there might have been 18 foods appearing 
in part A of table 2, 16 in part B, and 4 in part C. In fact, there were 
only 8, 7, and 3, respectively. The foods utilized in the least-cost 
solution for the most part were nutritious (and inexpensive) in several 
nutrients. Interestingly, the same foods were used in the solutions to 
the three problems, with certain foods dropping out as constraints are 
deleted. 

These solutions are strikingly similar to those presented by Stigler 
40 years ago. The matrix of technical coefficients for the foods ap¬ 
pearing in the LP solutions is given in table 3, along with the prices of 
those foods. When we take the households of group 4 as most closely 
representative of the median U.S. family in terms of size and income, 
the percentages of each nutrient total accounted for by consuming 
the goods in the LP solution are displayed in table 4, parts A-C,
corresponding to the solutions (for group 4 households) presented in
table 2, parts A-C. Also, part D of table 4 gives this same information
for the traditional diet problem where the RDAs are used as the 
constraints. 

In table 4, parts A-C, the top entry in each cell represents the
actual amount of nutrient i provided by food j in the linear programming
solution, a_ij x_j*. The lower entry represents the fraction of the
constraint level of nutrient i that is provided by food j, that is,
(a_ij x_j*)/b_i. These constraint levels, the b_i's, are the amounts of
nutrient i consumed in the actual diets by families in the
$15,000-$19,999 range; these were the levels used in the LPDP for this
income class. In the column headed b_i* are the row sums of the
fractions of nutrient i provided by each food, that is,
b_i* = Σ_j (a_ij x_j*)/b_i. In part A, in which all
18 nutrients are used as constraints, necessarily, b_i* ≥ 1, i = 1, ..., 18,
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Vitamin A .018 . . .339 1.163 ... 1.760 ... .700 4.685 

.026 484 1 661 2.514 

Vitamin D .351 ... ... ... ... ... .350 1.003 
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whereas in part B, b? ^ 1, t = 1,.... 16, and in part C, bf 3* 1 , i = 1, 
..., 4. (In some cases b_i* = 0.999 because of rounding error.)
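The two entries reported in each cell of table 4, and the row sums in its last column, follow mechanically from the nutrient coefficients, the solution quantities, and the constraint levels. A minimal sketch, assuming a hypothetical 3-nutrient, 2-food case in which the quantities `x_star` are simply given rather than solved for:

```python
def nutrient_shares(A, x_star, b):
    """Return (shares, b_star).

    shares[i][j] = a_ij * x_j* / b_i, the fraction of the constraint
                   level of nutrient i supplied by food j
    b_star[i]    = sum over foods j of shares[i][j]
    """
    shares = [
        [a_ij * x_j / b_i for a_ij, x_j in zip(row, x_star)]
        for row, b_i in zip(A, b)
    ]
    b_star = [sum(row) for row in shares]
    return shares, b_star


# Hypothetical values: 3 nutrients (rows) by 2 foods (columns).
A = [[4.0, 1.0],
     [1.0, 3.0],
     [2.0, 2.0]]
x_star = [1.5, 2.0]      # assumed solution quantities (pounds)
b = [8.0, 7.5, 5.0]      # assumed constraint levels

shares, b_star = nutrient_shares(A, x_star, b)
for i, (row, total) in enumerate(zip(shares, b_star), start=1):
    print(i, [round(s, 3) for s in row], round(total, 3))
```

In this made-up case the first two nutrients are met exactly (row sum 1) while the third is oversupplied (row sum above 1), which is the pattern the text describes for binding and nonbinding nutrients.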

Tables 2, part C, and 4, part C, are interesting in terms of the
nutritional properties of low-fat milk, wheat flour, and mustard
greens. These three foods appear to provide quite a healthy diet.
Although chosen to satisfy only the protein and vitamins A, D, and C
constraints, they approximately satisfy the constraints for most other
nutrients; only vitamin B12 is deficient in terms of its RDA.

In all the LP solutions, there were some foods whose inclusion 
would not substantially alter the minimum expenditure levels. Skim 
and whole milk were very close substitutes for low-fat milk, and vari¬ 
ous green vegetables—particularly broccoli, collards, and cabbage— 
could be substituted for mustard greens without dramatically affect¬ 
ing the cost of the diet. (Stigler’s solutions involved cabbage and 
spinach.) These foods all had indirect costs (imputed values) only 
slightly in excess of their actual costs. 11 Other foods for which this is
true are orange juice, grapefruit, and white potatoes. Only a small
amount of meat, in the form of pork chops, appeared in the LP
solutions, and then only when the two large sets of nutrients (but not
when the actual RDAs) were used as constraints. Wheat flour provides
the bulk of protein in these diets. 12

D. Inferior Goods

Whereas certain final goods in consumption may be inferior, it is
difficult to imagine why increased income would reduce the demand
for nutrition. With the exception of movement from income class 1 to
class 2, which may be a measurement problem, the nutrients studied
are in fact virtually all normal, at all income levels, though the income
elasticities appear to be very low. 13 Although these nutrients are normal,
consumers choose to arrange for the consumption of these nutrients
differently as income increases. In fact, though inferiority of
various foods is apparent in the data, there are surprisingly few instances
of dramatic inferiority. The only foods that exhibit marked
inferiority are processed milk, hot oat cereals (though oatmeal is inferior
at the lower incomes while hot wheat cereal is inferior at higher
incomes), cornmeal, stewing beef (but not at the highest income level),
liver, bacon, and, among vegetables, collards, mustard greens, and


11 The LINDO LP program provides these indirect costs as part of the output. The
foods listed above had indirect costs of less than 10¢ per pound.

12 No distinction of protein by amino acid composition has been made in this paper.
Such a consideration might increase the utilization of meat in the LP solutions.

13 One would expect to find larger income elasticities for consumers with incomes
lower than that represented by the lowest income group in these data.
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Average 

Income (1977)     Potatoes    Hamburger    Bread (Total)    Margarine    Butter

$1,587 +            1.13         .60           .97             .24         .05
$2,855              1.21         .64           .94             .24         .05
$4,132              1.15         .70          1.05             .22         .05
$5,259              1.07         .70           .88             .23         .06
$9,125              1.10         .73           .98             .23         .08


dried beans. Strikingly not particularly inferior are such textbook ex¬ 
amples as hamburger, margarine, and white potatoes. The per capita 
weekly consumption of these foods plus bread and butter is displayed 
in table 5. These data clearly do not support either Alfred Marshall’s 
(1920) example of bread or textbook examples of potatoes during the 
Irish potato famine as possible Giffen goods, assuming stability of 
tastes over time. 14 


IV. Conjectures and Conclusions 

The minimum-cost diets presented in this paper are the kinds of diets 
that would be provided to slaves, although the calorie level might be 
elevated. In fact the diet of American slaves in the antebellum South 
resembled these solutions (see, e.g., Fogel and Engerman 1974; Barzel
1977; Kahn, in press).

It would seem that other aspects of consumers’ behavior should be 
amenable to the framework utilized in this paper. Consider, for ex¬ 
ample, the choices consumers make about automobiles. In addition to 
transportation between two locations, cars also provide varying de¬ 
grees of style and comfort along the way. One should expect rapidly 
diminishing returns to the pure transport function of cars. Thus as 
incomes rise, one would expect a greater fraction of the price of a car 
to be attributable to style and comfort, and less to pure transporta¬ 
tion. The development of modern cars from the early Model T's 
certainly confirms this, but technological change is obviously also 
present. On a cross-sectional basis, it seems that various options (e.g., 
power steering) tend to become more commonly consumed as income 
rises. It would probably be difficult to purchase a stripped-down 
Cadillac or other car purchased mainly for the well-to-do. 

A similar analysis would apply to the case of housing. One should 


14 See also the recent article by Dwyer and Lindsay (1984). 
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expect rapidly diminishing returns to the pure shelter component of 
housing, that is, keeping warm and dry. As income rises, one should 
therefore expect to see a greater proportion of housing expenses 
directed toward commodiousness and less toward pure shelter. One 
would, for example, predict that the price of houses would increase in 
greater proportion than various hedonic measures of shelter such as 
square footage, after adjusting for the price of the lot on which the 
house stands. Applying theory in this manner may reveal new refut¬ 
able implications about consumer behavior. 


Appendix 

The following is a list of all foods utilized in this paper. 

1. Milk products: whole, buttermilk, skim, low-fat, yogurt, chocolate,
evaporated, condensed, dry.

2. Cream: light, heavy, half-and-half, sour, powdered creamers, semisolid
toppings, ice cream, ice milk, sherbet.

3. Cheese: American, Swiss, 2 percent cottage, cream, spreads, parmesan.

4. Fats, oils: butter, margarine, animal fat, vegetable oil, salad oil, regular
mayonnaise, imitation mayonnaise, French dressing, low calorie, other.

5. Flour, meal: wheat, other; mixes: pancake, biscuit, cake, pie, cookie,
other; hot oat, hot wheat cereal; cold corn, wheat, rice, oat cereal; cornmeal,
grits, macaroni, popcorn, cornstarch.

6. Bread: white, whole wheat, other.

7. Bakery products: crackers, rolls, muffins, cakes, pies, cookies, sweet buns,
doughnuts, pretzels, roll, biscuit, and pancake batters.

8. Beef: steaks: round, sirloin, porterhouse, hamburger, other; roasts:
chuck, rib, round, rump, corned and chipped beef, canned.

9. Pork: chops, ham, loin, sausage, other; cured ham, bacon, salt pork, other
cured pork, canned.

10. Veal: chops, roast, stewing.

11. Lamb: chops, roast, stewing.

12. Miscellaneous: liver, miscellaneous lunch meats, franks, bologna.

13. Fish: fresh/frozen, smoked, canned tuna, other canned, shellfish.

14. Eggs: fresh, processed.

15. Sugar, sweets: granulated, powdered, brown; syrups: corn, maple, molasses,
honey; jelly, jam, candy/toppings with/without chocolate or nuts, dry
gelatin pudding, ready-to-eat pudding; ices/popsicles, icings.

16. Potatoes: white, sweet, canned white, canned sweet, french fries, dried,
chips.

17. Fresh vegetables: spinach, kale, collards, mustard greens, other dark green;
broccoli, peppers, carrots, pumpkin/squash, tomatoes, asparagus, lima
beans, snap/wax beans, cabbage, lettuce, okra, peas, other green, celery,
cucumbers, onions, green onions, beets, cauliflower, corn, turnips, other.

18. Fresh fruits: grapefruit, lemons/limes, oranges, other citrus, cantaloupe,
strawberries, apples, bananas, cherries, other melons, peaches, pears, apricots,
avocados, grapes, pineapple, plums, rhubarb.
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19. Canned vegetables: dark green deep yellow, tomatoes, asparagus baked 
beans, lima beans, snap beans, beets, corn, green peas, sauerkraut, other.

20. Canned fruits: citrus, apples, apricots, cherries, peaches, pears, pineapple 


21. Frozen vegetables: leafy, broccoli, deep yellow, asparagus, lima beans, snap 
beans, green peas, corn, mixed, other, strawberries, other frozen fruit. 

22. Juice: canned tomato, other vegetable, orange, grapefruit, other citrus, 
apple, grape, pineapple, other noncitrus; frozen orange, frozen noncit¬ 
rus (grape), fresh citrus. 

23. Dried vegetables, fruits: beans, peas/lentils, prunes, raisins, other. 

24. Beverages: coffee, tea, chocolate, cola, fruit, diet, fruit ade, beer, whisky, 
wine. 


25. Soups: ready-to-serve meat; condensed grain, meat, mushroom, tomato; 
dehydrated grain, meat, vegetable. 

26. Nuts, condiments: peanuts in shell, shelled; other nuts, peanut butter, cat¬ 
sup, barbecue sauce, pickles, olives, relish. 

27. Leavenings: yeast, baking powder.

28. Baby food: meat, egg yolk, vegetable, fruit, juice; mixtures: mostly grain, 
meat, vegetable, fruit, cereal, teething biscuits. 
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Despite the important role played by migrants’ remittances, no sys¬ 
tematic theory of remittance behavior exists and surprisingly little 
statistical evidence on determinants of remittances has appeared. 
This paper presents several hypotheses for motivations to remit, 
ranging from pure altruism to self-interest. However, a far richer
array of predictions emerges from a model of tempered altruism or
enlightened self-interest in which remittances are one element in a
self-enforcing arrangement between migrant and home. This range
of arguments provides a framework for interpreting evidence on 
remittance patterns among individual migrants in Botswana. The 
paper closes with some thoughts on dualistic theories of develop¬ 
ment when mutually beneficial understandings between urban mi¬ 
grants and rural homes prevail. 


In examining determinants of fertility, marriage, and divorce, econo¬ 
mists have begun to address questions of household composition 
more traditionally posed by anthropologists and sociologists. Yet al¬ 
though membership in the household has now been exposed to 
econometric analysis, the unit of study is almost always the household 
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as a cohabiting group, eating from a common pot. I'he recent litera¬ 
ture on bequests does extend the familial context beyond that of the 
nuclear household to encompass intergenerational links for purposes 
of risk sharing, attention for the elderly, or storage of asset-specific 
knowledge (Kotlikoff and Spivak 1981; Rosenzweig and Wolpin
1983; Bernheim, Shleifer, and Summers 1984). But not only in¬ 
tergenerational issues warrant consideration of a broader family, for 
in many societies intrafamilial remittances are observed on a substan¬ 
tial scale. At least for certain economic dimensions, it seems the unit 
of analysis may appropriately be extended spatially across households 
as well as intergenerationally. 

In spite of the potential importance of remitting as a private mecha¬ 
nism of income redistribution between persons and across sectors, 
permitting, for example, consumption in excess of locally generated 
incomes or granting access to an additional source of capital funds, no 
comprehensive theory of remittances exists. Moreover, there is sur¬ 
prisingly little statistical evidence on the motives for remitting, and 
the few studies that have appeared are not couched in terms of test¬ 
able hypotheses derived from a theoretical framework (Mohammad, 
Butcher, and Gotsch 1973; Johnson and Whitelaw 1974; Rempel and
Lobdell 1978; Knowles and Anker 1981). 

Certainly the most obvious motive for remitting is pure altruism— 
the care of a migrant for those left behind. Indeed, this appears to be 
the single notion underlying much of the remittance literature. But 
household arrangements, particularly within an extended family, 
may be considerably more complex than such a simple form would 
suggest. Section IA sketches a theory of pure altruism, while Section
IB develops a polar opposite: remittance motivated entirely by pure
self-interest. Yet no doubt the world is more balanced. Section IC
therefore offers an account of tempered altruism or enlightened
self-interest. This views remittances as part of, or one clause in, a
self-enforcing contractual arrangement between migrant and family. The
underlying idea is that for the household as a whole it may be a 
Pareto-superior strategy to have members migrate elsewhere, either 
as a means of risk sharing or as an investment in access to higher 
earnings streams. Remittances may then be seen as a device for redis¬ 
tributing gains, with relative shares determined in an implicit ar¬ 
rangement struck between the migrant and remaining family. The 
migrant adheres to the contractual arrangement so long as it is in his 
or her interest to do so. This interest may be either altruistic or more 
self-seeking, such as concern for inheritance or the right to return 
home ultimately in dignity. Remitting may thus cease either if the 
arrangement is no longer self-enforcing or if the contractual terms 
themselves provide for cessation of transfers at a given point in time. 



MOTIVATIONS TO REMIT 



In evolving these ideas in Section I, an attempt is made to draw out 
certain empirical implications of the various elements. Section II then 
explores these implications in the context of a recent household sur¬ 
vey conducted in Botswana, and Section III closes the paper with 
some more general reflections on the concept of a household and 
dual theories of development. 

I. Theories of Remittance 

A. Pure Altruism 

If one states only that a typical migrant enjoys remitting, no testable
propositions emerge. However, more incisive results may be obtained
from an altruistic model wherein the migrant derives utility (u_m) from
the utility of those left at home, and the latter utility is presumed to
depend on per capita consumption (c_h). For example, suppose the
migrant maximizes his own utility with respect to the amount remitted (r):

\[ u_m = u\left[ c_m(w - r),\ \sum_{h=1}^{n} a_h\, u(c_h) \right], \tag{1} \]

where w is the migrant's wage, c_m is his or her consumption, a_h are
altruism weights attached to various household members, and n is the
household size. Consumption per capita may further be assumed to
increase with income per capita available at the home base and may
also vary with household size if there are economies or diseconomies
of scale in consumption:

\[ c_h = c\left( y + \frac{r}{n},\ n \right), \tag{2} \]

where y is the income per capita at home before receipt of any remittances.
Choosing a level of r to maximize (1) subject to (2) provides 1

r = r(w, y, n). (3) 

If the migrant indeed cares about his home family and if both his 
utility function (1) and the home family utility functions are well 
behaved, two properties of the remittance function (3) are predicted: 
that dr/dw > 0 and dr/dy < 0. The sign of dr/dn is unrestricted, however,
depending on the presence of (dis)economies of scale in consump¬ 
tion, the rate of diminution in the marginal utility of home consum- 

1 Implicitly, this treats w and y as given. In particular, the migrant is assumed neither
to work harder nor to accept worse working conditions with higher pay in order to
remit, and no moral hazard is involved in the sense of the home group's reducing
effort.
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ers, and whether specific preference exists for a subset of the home 
group. 
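The two signed predictions can be illustrated with one concrete special case. The sketch below is an assumption layered on top of the model, not part of it: utility is taken to be logarithmic, every household member receives the same altruism weight a, and home consumption per capita is simply y + r/n (no scale economies). Under those assumptions the migrant maximizes log(w - r) + a n log(y + r/n), and the first-order condition gives r = (a w - y)/(a + 1/n), truncated at zero.

```python
import math


def utility(r, w, y, n, a):
    """Migrant's utility under the assumed log form (illustrative only)."""
    return math.log(w - r) + a * n * math.log(y + r / n)


def optimal_remittance(w, y, n, a):
    """Closed-form maximizer of utility() over 0 <= r < w.

    First-order condition: 1/(w - r) = a/(y + r/n),
    hence r = (a*w - y) / (a + 1/n); negative values mean no remittance.
    """
    r = (a * w - y) / (a + 1.0 / n)
    return max(0.0, r)


# Hypothetical parameter values.
base = optimal_remittance(w=100.0, y=10.0, n=4, a=0.5)
higher_wage = optimal_remittance(w=120.0, y=10.0, n=4, a=0.5)
richer_home = optimal_remittance(w=100.0, y=20.0, n=4, a=0.5)

print(round(base, 2), round(higher_wage, 2), round(richer_home, 2))
```

Remittances rise with the migrant's wage and fall with home income, as predicted. In this particular functional form r also rises with n whenever a w > y, but that sign is an artifact of the assumptions; the general model leaves it unrestricted.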


B. Pure Self-Interest

As a stark counterpoint, three reasons to remit are now considered, 
relying on purely selfish motivations and the absence of altruism by 
the migrant toward the family. 

i) The first is the aspiration to inherit. If we assume that inheritance 
is conditioned on behavior, an avaricious migrant’s motives for sup¬ 
porting his or her family, and particularly parents, may encompass 
the concern to maintain favor in the line of inheritance. If applicable, 
this should generally mean larger remittances the larger is the poten¬ 
tial inheritance. 

ii) A second self-interest of the migrant in remitting home may be 
to invest in assets in the home area and ensure their careful mainte¬ 
nance. In this context, one’s own family may be a particularly trust¬ 
worthy agent both in selecting the specific item for purchase (land, 
cattle, etc.) and in maintaining the asset on the migrant's behalf. Altruism
of the family toward the migrant may underlie or enhance
such a trust. 2

iii) Third is the intent to return home, which may suffice to promote
remittance for investment in fixed capital such as land, livestock,
or a house, in public assets to enhance prestige or political influence,
and in what might be termed social assets—the relationships with 
family and friends. Yet the last of these serves to illustrate how inex¬ 
tricable are motives of altruism and self-interest. In the end one can¬ 
not probe whether the true motive is one of caring or more selfishly 
wishing to enhance prestige by being perceived as caring. 

C. Tempered Altruism or Enlightened Self-Interest

Both pure altruism and pure self-interest alone may be inadequate 
explanations of the extent of remittance and its variability, through 
time and across persons. In this section, an alternative theory is there¬ 
fore outlined, viewing remittances as part of an intertemporal, mutu¬ 
ally beneficial contractual arrangement between migrant and home. 


2 Two forms of investment are worth specific mention in this context. (a) It is, of
course, not uncommon for children to be left in the village, both their formal education
and upbringing being entrusted to family members with remittances intended to compensate
for the costs. (b) Remittances are often spent on religious structures or public
works in the home area, partly to enhance the prestige of the migrant and his family.
The migrant's own family may be immediate recipients of the remittances here, too, for
they have a natural monopoly in determining the migrant's home village prestige.

This theory is not merely the intersection of pure altruism and pure
self-interest but rather offers a quite separate set of hypotheses. We 
shall return to altruism and self-interest in accounting for enforce¬ 
ment of such mutual contracts. 

Two components may underlie the arrangements to be considered: 
investment and risk. It is well known that urban migrants are, on 
average, better educated than those remaining in the rural sector. 
The initial costs of this education, and particularly subsistence sup¬ 
port while not earning, are usually borne by the immediate family. 
Some authors have noted a positive association between the amount 
remitted and the education of the migrant and have interpreted this 
as repayment of the principal (plus interest) invested by the family 
(Johnson and Whitelaw 1974; Rempel and Lobdell 1978). It should 
be noted that such an argument does not necessarily require that a 
higher fraction of an enhanced, educated migrant’s wage be remitted: 
to reap a return on their investment, it is sufficient that the family’s 
receipts rise with the education of the migrant. Yet such a positive 
association is hardly a discriminating test of this explanation. Thus 
wages are generally higher for educated migrants, and the altruism 
model also indicates a positive association between remittance and 
wage and hence education. A more discerning test may be proposed, 
however. The costs of educating certain members of the particular 
household (e.g., children of the head) are far more likely to have been 
incurred by that household than are the costs for other members 
(daughters-in-law, sons-in-law, even spouses). The investment argu¬ 
ment would therefore predict that the effect of education on raising 
remittances should be greater among the former group than among 
the latter. 3 

Turning to the second component that may underlie mutually 
beneficial informal contracts, in an economy where insurance and 
capital markets are highly incomplete, the act of migration may be 
seen as a diversification response in the presence of risk. Risks of crop 
failure, price fluctuations, insecurity of land tenancy, livestock dis¬ 
eases, and inadequate availability of agricultural wage work may each 
render the rural context quite precarious (Stark and Levhari 1982). 
The household may elect to spread its risks by allocating some mem¬ 
bers to urban migration. Initial job search in town can, of course, also 
be risky, and even subsequent security of employment is not guaran¬ 
teed (Harris and Sabot 1982). But provided that the vagaries of the 


3 Such investment arguments may be extended, e.g., to include initial costs of relocation:
both financial costs of transportation and of preliminary job search and psychic
costs (Sjaastad 1962). An investment model would suggest that the greater are any such
initial costs imposed on the remaining family, the greater should be eventual remittances
in repayment.

rural and urban context are not highly positively correlated, it can
then be mutually beneficial to both migrant and family to enter a 
coinsurance contract. Remittances as claims would then flow to the 
family at times of crop failure and to the migrant during spells of 
unemployment. What will thus be the consequences for remittance 
patterns during unexceptional times? It would seem that this should 
depend on who is the net provider of insurance, relative risk preferences,
and relative bargaining powers. As with many forms of insurance,
an element of moral hazard may well emerge in such arrangements.
Indeed, at instances when the household wishes, for example,
to partake of a particularly risky strategy at home, trying some new
agricultural techniques, it may then pay to buy additional insurance
and send more members to town. 
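
The diversification argument above can be illustrated with a small numerical sketch. This is not the authors' model: it assumes, purely for illustration, that the family and the migrant each face a risky income with standard deviations `sd_rural` and `sd_urban` and correlation `rho` (all hypothetical names and values), and that they pool income and split it equally.

```python
import math


def pooled_sd(sd_rural, sd_urban, rho):
    """Standard deviation of each party's income under equal pooling.

    Assumes (hypothetically) that rural and urban incomes are pooled and
    split 50/50, so each party receives half of the sum.
    """
    var_sum = sd_rural ** 2 + sd_urban ** 2 + 2 * rho * sd_rural * sd_urban
    return math.sqrt(var_sum) / 2


if __name__ == "__main__":
    # With equal stand-alone risk (sd = 1), pooling lowers each party's risk
    # whenever the two incomes are less than perfectly correlated.
    for rho in (-0.5, 0.0, 0.5, 1.0):
        print(f"rho={rho:+.1f}  pooled sd={pooled_sd(1.0, 1.0, rho):.3f}")
```

Under these assumptions the gain from coinsurance disappears only as the correlation approaches one, which is the sense in which the rural and urban vagaries must not be "highly positively correlated."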

Arrangements between a migrant and family are voluntary and 
thus must be self-enforcing. Mutual altruism among close relations is 
the most obvious force in avoiding delinquency and presumably helps 
to explain why the family is the most frequent context of such arrangements.
Such elements as those discussed in Section IB (an aspiration
to inherit, convenience in rural investments, and the intent to
return) mean that the migrant retains a vested interest in his origins
beyond altruistic concerns. This interest increases the confidence of
the family that the migrant will not default and hence encourages the
prevalence of intrafamilial cooperative contracts. Moreover, these elements
of command of the family over its migrant members may contribute
to determining the distribution of benefits. If the bargain
struck between migrant and family as to the remittance pattern is
affected by elements of family command over the migrant, a prediction
emerges that is clearly contrary to the pure altruism story. Within
a game-theoretic view, greater wealth of the family should increase its 
relative bargaining strength. Thus whereas the pure altruism model 
predicts higher remittance to lower-income households, ceteris 
paribus, the reverse is implied by a bargaining model. 


II. Evidence from Botswana 

A. Specification and Estimation

To explore empirically some of the ideas that evolved in the forego¬ 
ing section, estimates of a rather complex remittance equation can 
now be reported. Each observation is an adult migrant from a particu¬ 
larly detailed household survey of migration conducted in Botswana: 
the National Migration Study of Botswana, 1978-79 (NMS). 

The model is of the following form:

$$
\begin{aligned}
r ={}& \beta_0 + \sum_{i=1}^{3} \omega_i w_i + \alpha_1 y + \alpha_2 n + \epsilon_1 e + \epsilon_2 e j \\
&+ \kappa_1 k + \kappa_2 k i + \kappa_3 k p + \delta_1 d + \delta_2 b + \delta_3 b d + \delta_4 l + \delta_5 l d \\
&+ \phi_1 f + \phi_2 h + \phi_3 v + \sum_{i=1}^{5} \tau_i t_i + \mu m,
\end{aligned}
\tag{4}
$$

where r = log of monthly remittances in cash and kind, averaged over
the number of survey rounds in which the person was absent, or zero
if no remittances occurred; w = dummies for ranges in monthly
wages plus net self-employment earnings of the migrant; y = log of
income generated by the home group per consumer unit; n = log
number of consumer units at home (adults = 1, children = .5,
weighted by frequency of presence at home over the survey rounds); e
= years of education completed by the migrant; j = a dummy equal
to one if the migrant was a member of this household when young; k
= a dummy equal to one if the household owns more than 20 cattle; i
= a dummy equal to one for sons and daughters-in-law of the household;
p = a dummy equal to one for head of household and head’s
spouse; d = an index for ongoing local drought; b = log of number
of cattle owned; l = log of crop acres; f = a dummy equal to one if
female; h = a dummy equal to one if head of household; v = a
dummy equal to one if child of head or of head’s spouse; t = breakpoints
in a piecewise linear function for duration of absence (years);
and m = statistical hazard rate.

The reasons for adopting this specification, and in particular the 
various interaction terms, may best be discussed together with the 
results. More precise definitions of certain variables will be given in
the text, and sample means and standard deviations of variables are 
tabulated in the Appendix (table Al). 

Before actually turning to the results, two prior issues of estimation 
must be addressed. In the NMS, four interviews were conducted at 
each dwelling, one every 3 months, and each person reported absent 
on any occasion is included in the analysis as a potential remitter. 4


4 Absentees are defined to include all persons whom the head of the household
regards as normally belonging to that lolwapa (home) who did not sleep there on the
previous night. It is neither clearly relevant to distinguish temporary from long-term
migrants nor simple as a practical matter. The Batswana are amazingly peripatetic; the
margin between whether one is based at home and visiting elsewhere or vice versa is
quite illusive. A man who has worked in town for 10 or 20 years will answer that he does
not live there: he lives at "home" but stays in town. In the first round of interviews,
spokespersons were asked also whether anyone besides absent members had remitted,
but the number of positive responses was negligible, and the module was dropped from
subsequent rounds.

The absentees may be divided into three groups: those in town, those
elsewhere in the rural sector (including the commercial freehold 
farms), and those in South Africa. The last group is omitted from the 
present study for two reasons: first, most Botswana citizens employed 
in South Africa are in mines, and miners send or bring money home 
both in the form of remittances and through a deferred pay scheme, 
but the NMS does not record the amounts of deferred pay; second, 
there is insufficient information about variation in earnings among 
the absent miners. Thus two issues of sample selection arise: the selec¬ 
tion on who is absent from the home and, within this group, who is in 
South Africa and hence excluded. Moreover, for reasons to be dis¬ 
cussed later, separate estimates of equation (4) are undertaken for 
those absent in urban Botswana, adding a further selection criterion. 
To correct for potential sample selection bias, a hazard rate (assuming 
a normal distribution) is included in the remittance regressions. The 
hazard rates are calculated from the probability of sample entry, dis¬ 
tinguishing overall and urban only, among all adults including those 
absent in South Africa, as predicted by a binomial logit equation 
specified as 

2 __ ^ TTl) f 7f| f + Ttlh +- nyr + TXAg* +■ 7TV 4- ^ 

where z equals the odds of being included in the remittance sample, g
is age, and other terms are as previously defined. 
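
The text does not spell out how the hazard rate is computed from the logit probabilities. One common construction, sketched here as an assumption rather than as the authors' documented procedure, maps the predicted probability of sample entry `p` to a standard normal quantile and forms the ratio of the normal density to `p`. The function names and the example index value are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def logit_probability(index):
    """Probability of sample entry implied by a logit index (log odds)."""
    return 1.0 / (1.0 + math.exp(-index))


def hazard_rate(p):
    """Selection-correction term from an entry probability p in (0, 1).

    Assumed construction: transform p to the normal quantile z, then
    return pdf(z) / p. This is one standard way of pairing a logit
    selection equation with a normality assumption.
    """
    z = _STD_NORMAL.inv_cdf(p)
    return _STD_NORMAL.pdf(z) / p


if __name__ == "__main__":
    p = logit_probability(0.4)  # hypothetical fitted log odds
    print(f"p={p:.3f}  hazard={hazard_rate(p):.3f}")
```

Under this construction the term is larger for persons with a low predicted probability of entering the remittance sample, so it enters the remittance regression as an additional regressor (m in equation (4)).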

The second prior issue is that the NMS does not record earnings of 
any absentees. To circumvent this lacuna, earnings equations are es¬ 
timated from NMS data on persons present in urban Botswana, the 
freehold farms, and the tribal areas (the residual rural sector). The
estimates appear in the Appendix (table A2). Explanatory variables 
are confined to information known also about each absentee, so that 
absentees’ earnings may be predicted if they are reported to be work¬ 
ing. Earnings include both wages and net self-employment earnings. 

B. Results 

The estimated remittance equations for all absentees and urban absentees
only are reported in table 1, with t-statistics for a zero null
hypothesis in parentheses beneath each coefficient. 5

In both contexts, remittances rise steadily with the earnings of the 
absentee. Although this result is certainly consistent with the proper¬ 
ties of the simple, pure altruism model (3), it is not a very discerning 


5 Only those adults, ages 15-54, who are not in school or higher education are
included. The upper age limit is imposed to concentrate on a working age group. In
this sense, 54 is too young, but the next age group coded in the survey is open ended.



TABLE 1

Estimated Remittance Equations

                                                        All            Urban
Variables                                               Absentees      Absentees Only

Intercept                                               -.514          -.560
                                                        (5.35)         (2.88)
Log absentee’s wage (lw):
  0 < lw ≤ 3                                             .251           .409
                                                        (3.59)         (3.51)
  3 < lw ≤ 4                                             .472           .578
                                                        (10.20)        (6.50)
  4 < lw                                                 .732           .666
                                                        (11.53)        (5.61)
Log home income per consumer unit                        .011           .022
                                                        (1.11)         (1.24)
Log consumer units at home                               .084           .069
                                                        (2.42)         (1.01)
Years of schooling                                      -.002          -.034
                                                        (.23)          (1.62)
Years of schooling * own young                           .014           .015
                                                        (1.50)         (.88)
Number of cattle owned > 20                             -.054           .061
                                                        (1.06)         (.47)
Number of cattle owned > 20 * sons and their spouses     .154           .228
                                                        (2.26)         (1.80)
Number of cattle owned > 20 * heads and spouses          .162           .465
                                                        (1.85)         (1.93)
Drought                                                                 .066
                                                                       (.25)
Log cattle                                                             -.071
                                                                       (1.75)
Log cattle * drought                                                    .165
                                                                       (1.83)
Log crop acres                                                         -.011
                                                                       (.32)
Log crop acres * drought                                                .184
                                                                       (1.66)
Female                                                   .141           .166
                                                        (3.29)         (1.89)
Head of household                                        .498           .801
                                                        (7.50)         (5.69)
Child of head                                            .011          -.024
                                                        (.24)          (.29)
Duration of absence:
  .5 year                                                .155           .117
                                                        (2.30)         (.88)
  1 year                                                 .211           .139
                                                        (3.17)         (1.10)
  2 years                                                .334           .209
                                                        (4.66)         (1.56)
  5 years                                                .530           .475
                                                        (6.79)         (3.32)
  50 years                                               .056           .061
                                                        (.37)          (.18)
Hazard rate                                             15.4           59.2
                                                        (3.07)         (3.79)
R2                                                       .22            .21
F-statistic                                             37.3           11.4
Number of observations                                  2,531          1,027
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test. For example, ability to coinsure or signaling concern when aspir¬ 
ing to inherit would also imply rising remittance with earnings. But 
the kernel of the pure altruism model is the per capita income of the 
home group. Instead of being negative as the pure altruism model 
hypothesizes, the estimates tend to show a positive association be¬ 
tween amount remitted and per capita income of the household from 
other sources. 6 In a dynamic setting, one cannot rule out the possibility
that past remittances, sent with altruistic intent, have helped to
raise today’s income. To disentangle satisfactorily such life-cycle mod¬ 
els ideally requires longitudinal data. Nonetheless, these cross- 
sectional results do show that, at the particular time of the survey, 
absentees were not remitting more to support lower-income home 
groups out of any given level of earnings. 

To test whether remittances are part of an intertemporal under¬ 
standing to recompense the family for initial sacrifices during the 
migrant's schooling, it is essential to distinguish groups for whom the 
family has made such sacrifices. This distinction cannot be made 
definitively from the NMS data or indeed from most household sur¬ 
veys. But groups for whom the costs of schooling are more likely to 
have been borne by this specific family may be defined. In the present 
context, the more likely group, labeled "own young” in table 1, is 
assumed to consist of children of the household head, grandchildren, 
and nieces and nephews. 7 Both in the overall regression and for urban
absentees only, the coefficient on years of schooling interacted
with the dummy for own young tends to be positive, though
confidence levels for these tests are not high. However, it was noted in
Section IC that repayment of school costs does not require that a
higher fraction of a given wage be remitted, only that remittances rise 
with education by those in whom investments are made. If the overall 
equation of table 1 is therefore reestimated omitting the wage class 
dummies, it is estimated that remittances not only rise significantly 
with years of schooling for all absentees but also rise significantly 
more (at a 93 percent confidence level) among the household’s own 
young. Thus support is certainly lent to the notion that remittances 
are partially a result of an understanding to repay initial educational 
investments. 

Inheritance customs and laws among the Batswana are neither universal
nor immutable. Practices vary from tribe to tribe and within a


6 This result proves robust over a wide range of specifications, including runs in
which land and cattle owned are omitted.

7 Uncles play an important role in Batswana kinship patterns. If nieces and nephews
are reported as absent members of a household, their education quite probably has
been the responsibility of this household, and their own parents may well live with the
household.

given tribe. 8 Indeed, a statistical study of inheritance, whether in
Botswana or elsewhere, truly requires a specialized survey if only to 
record the existence of and links with children who are no longer 
members of the head’s household, having established their own 
home. Nonetheless, some limited, suggestive evidence may be ob¬ 
tained from the NMS. On average, sons are more likely to inherit 
than daughters or other household members, though their inher¬ 
itance is not assured. It may therefore be asked of the data whether 
sons (and their spouses) remit more to families with greater amounts 
of inheritable wealth and whether this differs from the practices of 
others. Cattle are the dominant form of inheritable wealth in Bots¬ 
wana. All land (other than towns and freehold farms) is common 
property of the tribes. In the regression of table 1, a dummy variable 
is included for whether the household has a cattle herd larger than 20 
beasts. 9 Both overall and from urban areas only, sons do remit more
to families with larger herds, the effect passing a 2 percent and a 7 
percent significance test, respectively. On the other hand, if an addi¬ 
tional term is inserted, interacting the dummy for more than 20 cattle 
with a dummy for daughters and their spouses, the associated 
coefficient proves to be weakly negative. Thus sons do behave 
significantly differently from daughters and from other relatives in 
remitting more to households with large herds, which is consistent 
with a strategy to maintain favor in inheritance. 

However, it is also common for sons to keep their cattle with those 
of the household, and, in fact, the distinction is not always clear-cut. 
Thus an alternative or additional motive for sons to remit to households
with larger herds is for maintenance or indeed expansion of
their herd. Whether the motive here is to inherit or to have the 
household care for and acquire cattle on behalf of the son (both 
arguments listed in the pure self-interest model in Sec. I) cannot be 
discerned. Indeed, it is not obvious that they are truly distinct argu¬ 
ments in this context unless the son is assured that the demarcation of 
his cattle will be rigidly maintained.

The view of remittances as part of a coinsurance contract indicates 
certain ideas that also may be explored empirically. During times of 
particular hardship in rural areas (such as that of drought in Bots¬ 
wana), this approach predicts that urban to rural remittances should 
increase as claims are made against the coinsurance arrangements. 
However, such a test is not definitive, for a pure altruism model has


8 See Schapera (1938), esp. pp. 214-38, and a more recent account in Roberts (1972).

9 A herd of 20 beasts is often adopted as a significant dividing line in Botswana. A
common argument for this particular breakpoint is that some eight beasts are required
for ploughing and a total herd of 20 is necessary to assure eight adult ploughing
animals.

the same prediction as family income declines. A more discerning test
is whether during times of crisis remittances are relatively higher to 
households possessing assets particularly sensitive to such incidents. 
During droughts are remittances to operators of cropland and own¬ 
ers of livestock greater than to other households? 

The year of the NMS survey, 1978-79, happened to be one of
substantial drought, though its severity varied from location to loca¬ 
tion. For each village sampled, the severity of drought is measured by 
an index defined as 1 — (actual rainfall/30-year average rainfall) for 
that location. In the urban absentee remittance equation in table 1, 
this index is included separately and also interacted with the 
logarithm of crop acres operated and of cattle owned. The reasons 
for confining this analysis to the urban case alone are twofold: (a) for
the household to spread risks, it may well be wiser to send migrants to
town (or South Africa, if possible) rather than elsewhere within the
rural sector, where outcomes are probably more highly positively correlated;
(b) the extent of drought in each absentee’s rural destination
is not known. 
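
The drought index and its interactions described above can be sketched as follows. The index formula is the one given in the text; the function and argument names are hypothetical, and the sketch assumes strictly positive crop acres and cattle counts so that the logarithms are defined (how zero holdings were treated in the original estimation is not stated).

```python
import math


def drought_index(actual_rainfall, avg_rainfall_30yr):
    """Severity of drought: 1 - (actual rainfall / 30-year average rainfall)."""
    return 1.0 - actual_rainfall / avg_rainfall_30yr


def drought_regressors(actual_rainfall, avg_rainfall_30yr, crop_acres, cattle):
    """Drought terms of the urban remittance equation for one household.

    Assumes crop_acres > 0 and cattle > 0 (an assumption of this sketch).
    """
    d = drought_index(actual_rainfall, avg_rainfall_30yr)
    log_acres = math.log(crop_acres)
    log_cattle = math.log(cattle)
    return {
        "drought": d,
        "log_cattle": log_cattle,
        "log_cattle_x_drought": log_cattle * d,
        "log_crop_acres": log_acres,
        "log_crop_acres_x_drought": log_acres * d,
    }


if __name__ == "__main__":
    # Hypothetical village: 300 mm of rain against a 400 mm 30-year average.
    print(drought_regressors(300.0, 400.0, crop_acres=10.0, cattle=25.0))
```

An index of zero corresponds to a normal rainfall year, so the interaction terms switch on only in proportion to the rainfall shortfall at the household's location.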

If the two interaction terms of drought with cattle and with land are 
omitted, the coefficient on the drought index alone proves 
significantly positive—the more severe the drought, the greater are 
remittances. But in the specification reported in table 1, with the 
interactions included, it is seen that the existence of drought condi¬ 
tions and the possession of more cattle or more cropland have noth¬ 
ing to do with stimulating greater remittances per se. The interactions 
of drought with these drought-sensitive assets do. Those families who 
are at risk of losing cattle unless feed and water rights can be pur¬ 
chased, those who are at risk by virtue of normally relying on crops 
for more of their sustenance, are the ones who receive greater remit¬ 
tances during the drought. This is precisely the response one should 
expect if households allocate members to urban migration in order to 
insure against adopting risky asset portfolios at home. 10 

Although some evidence is thus found to support such ideas as 
repayment of school costs, aspirations to inherit, and coinsurance, this 
is not to deny the importance of caring for one’s family. Indeed, 
remittances by heads of households are substantially and significantly 
greater than those by other absentees, no doubt reflecting a sense of 
caring and responsibility on behalf of family heads. Interestingly, 


10 An alternative interpretation, that during drought members migrate to town in
order to remit, may be ruled out. If the drought index is interacted with a dummy
variable for having been in town 6 months or less, the estimated sign on the associated
coefficient is negative. Remittances during drought thus come from those members
already well placed in town.

however, children of the head do not remit more than any other
members of the household, ceteris paribus, as indicated in table 1. 

Last, the role of duration of absence may be mentioned. If out of 
sight, out of mind were the rule, one should expect remittances to 
fade with duration of absence. If repayment of school costs were the 
target, again remittances should ultimately cease. Indeed, the piecewise
linear terms on duration of absence do suggest an ultimate diminution
in remittance, but not within the first 5 years, during which
remittances continue to rise. 11 It seems that those who continue to be 
identified as household members are very persistent remitters, and 
vice versa. 
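
The size of the rise over the first 5 years can be read off the duration coefficients in table 1. The sketch below assumes that each duration coefficient is the level of log remittances at that breakpoint, so that differences between adjacent coefficients are log-point changes; this reading is an assumption, although it reproduces the percentages the authors cite for urban absentees. The dictionary and function names are hypothetical.

```python
import math

# Urban-absentee duration coefficients from table 1, keyed by breakpoint (years).
URBAN_DURATION_COEFS = {0.5: 0.117, 1: 0.139, 2: 0.209, 5: 0.475, 50: 0.061}


def percent_change(coef_start, coef_end):
    """Percent change in remittances implied by a change in log remittances."""
    return 100.0 * (math.exp(coef_end - coef_start) - 1.0)


if __name__ == "__main__":
    one_to_two = percent_change(URBAN_DURATION_COEFS[1], URBAN_DURATION_COEFS[2])
    two_to_five = percent_change(URBAN_DURATION_COEFS[2], URBAN_DURATION_COEFS[5])
    print(f"1 to 2 years: {one_to_two:.1f} percent")
    print(f"2 to 5 years: {two_to_five:.1f} percent")
```

On this reading, remittances by urban absentees rise by roughly 7 percent between 1 and 2 years of absence and by roughly 30 percent over the following 3 years.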

III. Concluding Remarks 

Several elements in a theory of motivations to remit are outlined in 
this paper; empirical implications are elicited and explored using evi¬ 
dence from Botswana. In contrast, the small number of prior statisti¬ 
cal studies of remittance correlates have been primarily concerned 
with discerning patterns rather than developing an explicit behavioral 
model. 

The single notion underlying most previous work is that migrants 
remit because they care for those left behind. Unless qualified, this 
idea has no empirical content. It does not begin to answer why some 
migrants remit more, why some remit for longer, or why some do not 
remit at all. Yet altruism as a motive may be refined, and in Section I a 
model of pure altruism was presented. But the evidence of Section II 
does not lend support to one of the main components of such a 
model, that remittances should be greater to lower-income families, 
ceteris paribus. Altruism alone does not appear to be a sufficient 
explanation of the motivations to remit, at least not in Botswana. 

But this does not deny that altruism may be an important or even 
critical component. Rather, the evidence in this paper lends support 
to a more eclectic model, which Section I labels tempered altruism or 
enlightened self-interest. This views the migrant and family as having 
an implicit understanding that is of mutual benefit. For the household 


11 It is interesting to compare the initial rising remittance profile with the estimated
asymptotic rise in urban earnings. The urban coefficients in table 1 indicate a 7 percent
rise in remittances as absence increases from 1 to 2 years and a further cumulative rise
of 30 percent over the next 3 years. In both intervals the earnings equations in the App.
imply a rise in earnings of 19 percent for males and 31 percent for females. This
suggests that the remittance profile rises less rapidly than that of earnings in the earlier
period but that then the two approximately keep pace. During the initial, very uncertain
period, the migrant is not expected to support the family, but once successfully
settled, such an undertaking becomes apparent.

the greater is the degree of ongoing drought ...

home area, the more ... motivated by simple altruism.

Once the drought index is interacted with amounts of cattle and land
possessed, assets at risk during drought, drought alone is not positively
associated with remittance. Migrants act not to relieve all
families in drought-stricken areas but to protect during the drought
those families with more cattle. This is quite consistent with families
reaching an understanding that urban members will provide insurance
during drought, and this permits the household to pursue a
riskier rural strategy.

A second source of potential gain to the overall household is derived
from investing in the education of youngsters who then migrate
to town to reap returns and remit to repay the family’s outlay. To test
this, it is insufficient merely to note that remittances rise (or even rise
as a fraction of wage) with education, as previous studies have suggested.
However, in Section II, a contrast was drawn between household
members in whose education the family is more likely to invest
compared to others. Evidence is found to suggest that the family’s
own youngsters do increase remittances more with education than do
others, tending to confirm a repayment hypothesis.


But how can such understandings between migrant and family be
enforced? For example, if schooling costs are incurred before migration,
what incentive does the migrant have to remit subsequently?
Three classes of self-seeking motives on behalf of the migrant are
postulated: to aspire to an inheritance; to channel one’s rural investments
through the trustworthy family both as purchasing agent and
for subsequent maintenance; and to retain the prospect of ultimately
returning home with dignity. Again, the Botswana data are highly
suggestive (though not definitive), showing sons remitting more to
households with more cattle. Moreover, this effect is stronger among
sons than among daughters or others. Sons are more likely to inherit,
though certainly not necessarily so, and they need to maintain favor
with the head of the household.

Yet self-seeking motives of the migrant do not complete the mechanisms
of enforcing intertemporal arrangements. Indeed, so far, nothing
in the eclectic model explains why remittances are concentrated
within the family or why the family should be the source of school
investment and insurance. Altruism is a relatively family-specific asset.
It is precisely the existence of intrafamilial mutual care that encourages
such arrangements to be confined within the family and
helps to reinforce arrangements that are of mutual benefit. Both 
altruism and individual gain are components: altruism is tempered 
and self-interest enlightened. 

In many senses, this paper may be seen as beginning to extend the 
recent intergenerational view of the household to a spatial dimension. 
But such an extension is not trivial. It has quite profound implications 
for many elements in conventional wisdom: the distribution of in¬ 
come across geographically extended households may bear little re¬ 
semblance to the more conventionally measured distribution across 
dwellings, and neither is uniquely appropriate; high urban earnings 
create a labor aristocracy that must be reevaluated if that aristocracy is 
part of a plebeian set of households; and dualistic theories of develop¬ 
ment must be revised. Instead of an urban sector and a rural sector, 
each with its own populace benefiting from the sectoral-specific 
speeds of development, the family straddles the two. Classes cease to 
be only peasants and workers, and a hybrid peasant-worker group 
emerges. This perception is not new to anthropologists but has not 
previously been integrated with the economics of the household.


Money and Asset Prices in a Cash-in-Advance
Economy


Lars E. O. Svensson 

University of Stockholm 


An asset-pricing model with money introduced via a cash-in-advance 
constraint is presented. The monetary velocity is variable; hence 
money demand does not obey the trivial quantity equation. The 
effects of disturbances in output and money growth on real balances, 
the price level, and interest rates are examined. Monetary policy has 
effects on real asset prices. The Fisher relation and the premium on 
nominal bonds are discussed. The precise role of the timing of infor¬ 
mation and transactions for properties of price levels and interest 
rates are clarified. 


I. Introduction 

This paper is a study of the demand for money, of the determination 
of the price level and nominal and real interest rates, and to some 
extent of asset prices in general in a monetary economy; in particular 
of how these variables are affected by changes in money supply and in 
income. The purpose of the paper is mainly methodological. It shows
one possible way of integrating monetary, financial, and real aspects
in a general equilibrium framework. In particular, the equilibrium is
simple and yet reasonable, and the model should be possible to apply 
to a variety of issues. One application is included in the paper, namely 
a discussion of the Fisher relation and the premium on nominal 
bonds relative to indexed bonds. Other possible applications are dis¬ 
cussed in the concluding section. 

I derive the demand for money by treating money symmetrically
with other assets. Ordinary assets have a value and are held because
they give a return: dividends. In complete analogy, one can think of
money as having a value and being held because it gives a return:
liquidity services. Once these liquidity services have been specified,
the price of money can be determined by an asset-pricing equation as
the price of other assets; only the liquidity services replace the dividends
in the equation. This idea is not new; here I follow, for instance,
the works of Dixit and Goldman (1970), Kouri (1977), Fama
and Farber (1979), Hodrick (1981), Jones (1983), Stulz (1983), and
LeRoy (1984a, 1984b). They all derive a demand for money by considering
money as an asset that pays liquidity services rather than
dividends. This literature has some shortcomings, though. First, with
the exception of Jones (1983) and LeRoy (1984a, 1984b), it is not fully
general equilibrium in the sense that the stochastic processes of some
or all prices and interest rates are exogenously given, and not functions
of more fundamental stochastic processes, of money supply and
output, say. Second, by putting real balances directly into the utility
functions, it assumes that real balances give direct utility, and the
liquidity services of money are simply given by the direct marginal
utility of real balances. 1

The present paper attempts to improve upon this literature in both 
these respects. First, by following the analysis of the two seminal 
papers by Lucas (1978, 1982) and using his general equilibrium asset-pricing 
model of a pure exchange economy, I can indeed specify a 
stochastic steady state where all prices and interest rates are endogenous 
functions of exogenous stochastic processes of money supply 
and output. 2 Second, instead of postulating that real balances give 
direct utility, I derive the demand for money via a cash-in-advance 
constraint, the liquidity constraint associated with Clower (1967) and 
thoroughly discussed by Kohn (1981, 1984). This way the liquidity 


1 Fama and Farber (1979) and Stulz (1983) assume that consumption goods and real 
balances jointly produce consumption services, which in turn give direct utility. This is of 
course the same assumption. 

2 A paper by Danthine and Donaldson (1983) extends Lucas's (1978) barter model to 
include real balances in the utility function. Only endowment shocks are dealt with, 
though. 
services of money are endogenously determined by the value of relaxing 
the liquidity constraint, that is, the shadow price of the liquidity constraint. 
However, when introducing the cash-in-advance constraint I 
avoid a common pitfall of the cash-in-advance literature, namely that 
the elaborate structure results in, or is by assumption restricted to, the 
trivial quantity equation with a unitary income velocity of money. I do 
this by generalizing Lucas's (1982) cash-in-advance model. 

In Lucas’s model consumers enter each period with a portfolio of 
assets, but (in equilibrium) without any money. In the beginning of 
each period they learn the current state of the economy after which 
they trade assets and money. Then they buy consumption goods. 
Consumption goods must be paid for with cash, which gives rise to the 
liquidity constraint. With this setup, consumers acquire cash after 
they know the current state and their current consumption. Lucas 
restricts the discussion to equilibria where nominal interest rates are 
positive. With positive interest rates consumers avoid interest losses 
on excess cash balances by acquiring exactly the amount of cash they 
need to buy their consumption goods. The result is that equilibrium 
cash balances acquired within each period are exactly equal to the 
nominal value of output. As in most other cash-in-advance models— 
for instance, Wilson (1979), Helpman (1981), Persson (1982), and 
Kouri (1983)—the demand for money is then given by the simple 
quantity equation, since the income velocity of money is driven up to 
its technological maximum and is identically equal to unity. There is 
then only a transactions demand for money, and no precautionary 
demand. 

My model differs from Lucas’s in that consumers must decide on 
their cash belongings before they know the current state and hence 
before they know their consumption. 3 This gives rise to a combined 
transactions, precautionary, and store-of-value demand for money. 
Money is held for the future liquidity services it provides, and the 
value of these liquidity services is the value of relaxing the future 
liquidity constraint. In particular, positive nominal interest rates are 
no longer inconsistent with a liquidity constraint that is nonbinding in 
some states. Since I allow nonbinding liquidity constraints, the simple 
quantity equation does not hold, and I get a more general—and more 
reasonable—demand for money. 4 


3 Lucas (1982) mentions the possibility of this modification in his concluding section. 

4 Previous papers with cash-in-advance models with a variable nonunitary monetary 
velocity include Goldman (1974), Lucas (1980), Stockman (1980), Helpman and Razin 
(1982), and Krugman, Persson, and Svensson (1985). These papers derive a combined 
transactions, precautionary, and store-of-value money demand, but they do not specify 
a full stochastic stationary general equilibrium with several assets, and they do not 
exploit the symmetry between the asset-pricing equations for money and other assets. 




In summary, relative to Dixit and Goldman (1970), Kouri (1977), 
Fama and Farber (1979), Jones (1983), and Stulz (1983), I have a full 
general equilibrium solution rather than a partial equilibrium (again 
with the exception of Jones) and a better micro foundation of money. 
Relative to Lucas (1982) and most of the cash-in-advance literature, I 
have a more reasonable demand for money with variable velocity. In 
particular, I have an explicit and very simple solution to the equilib¬ 
rium. 

In Section II, I present the model and derive the equilibrium equations. 
In Section III, I examine equilibrium real balances and interest 
rates as functions of stochastic disturbances in monetary expansion 
and income. In Section IV, I look at the Fisher relation and the 
premium on nominal bonds relative to indexed bonds. In contrast to 
previous partial equilibrium literature, for instance, Fama and Farber 
(1979), I can specify the effects on these parity conditions of the 
parameters of the fundamental stochastic processes of monetary ex¬ 
pansion and income, and it turns out that these effects differ from 
results of previous general equilibrium approaches involving the 
quantity theory, for instance, Kouri (1983). Section V discusses the 
importance of the timing of information and transactions and com¬ 
pares in some detail with Lucas (1982). Some conclusions, limitations 
of the analysis, and suggestions for further research are presented in 
Section VI. 

II. The Model 

I consider a closed monetary economy. A representative firm produces 
a stochastic output $y_t$ in period $t$, $t = \ldots, -1, 0, 1, \ldots$, of a 
single nonstorable good. The supply $\bar M_t$ of money in period $t$ is random 
and is given by 

    $\bar M_{t+1} = \omega_t \bar M_t$,    (1) 


Lucas (1980) considers an economy where money is the only asset. Helpman and Razin 
(1982) and Krugman, Persson, and Svensson (1985) consider a two-period general 
equilibrium with bonds and money, but with very asymmetric periods. A recent paper 
by Lucas (1985), which I received after the first version of the present paper was 
written, overlaps to some extent with the present paper. In his paper a model is 
presented with some information arriving both after the goods market and before the 
asset market, as in Lucas (1982), and after the asset market and before the goods 
market, as in the present paper. The possibility of a precautionary demand for money 
and a nonunitary income velocity of money is emphasized. Lucas derives and discusses 
in general terms the equilibrium conditions characterizing the equilibrium, but no 
explicit solution to a stationary stochastic equilibrium is presented, and the equilibrium 
is not subject to a detailed examination. Townsend (1982) presents a stochastic monetary 
asset-pricing model with cash-in-advance constraints and discusses asset pricing 
and rate-of-return dominance of money in general terms. No explicit solution to the 
equilibrium is given, except for a two-date deterministic example. 





where $\omega_t$, (the gross rate of) monetary expansion, is stochastic. Call 
$s_t = (y_t, \omega_t)$ the state in period $t$. It follows a Markov process with the 
probability distribution of $s_{t+1}$ given by the distribution function 
$F(s_{t+1}; s_t)$. That is, the probability distribution of $s_{t+1}$ depends on the 
realization of $s_t$ only. 

The economy has a representative consumer with preferences 


    $E_t \sum_{\tau=t}^{\infty} \beta^{\tau-t} u(c_\tau)$, $0 < \beta < 1$,    (2) 

in period $t$, where $u(c)$ is a concave utility function of consumption $c$ 
and $E_t$ is the expectations operator conditional on information 
available at t. We shall first look at the decision problem of the con¬ 
sumer. We assume that there exist traded claims, shares, to the firm’s 
dividends, which equal the firm’s sales of output. Also the consumer 
receives net transfers of money in each period. 5 

The timing of events is crucial: the consumer enters period t with 
predetermined holdings $M_t$ of money and a predetermined share $z_t$ of 
claims to the firm’s dividends. He learns the current state $s_t$ and then 
has the opportunity to purchase goods with money, at a price 
$P(s_t, \bar M_t)$. 6 His purchases $c_t$ must obey the liquidity constraint 

    $P(s_t, \bar M_t) c_t \le M_t$.    (3) 

After the goods market is closed, at the end of period $t$ he receives, in 
cash, his share of dividends, $P(s_t, \bar M_t) y_t z_t$, as well as the period’s lump-sum 
money transfers, $(\omega_t - 1)\bar M_t$. Note that the consumer by assumption 
receives the money transfer after the close of the consumption 
goods markets; hence the money transfer cannot be used to buy con¬ 
sumption goods within the same period. Put differently, money trans¬ 
fers—when regarded as assets—are by assumption not more liquid 
than shares in that they do not pay their “dividend” until after the 
consumption market is closed. When regarded as liabilities, taxes are 
paid at the same time as cash income is received; the consumer does 
not have to hold cash as a precaution for random tax payments. 

Finally, at the end of period t, money and shares are traded. The 
consumer then faces the budget constraint 


5 As emphasized by Lucas (1978, 1982), there is in general a certain arbitrariness in 
the number and kinds of assets assumed traded in these models, as long as there exist 
sufficiently many assets to ensure existence of a stochastic stationary equilibrium. In the 
present case, the shares can be deleted if dividends can be distributed directly to the 
representative consumer. 

6 Here prices are a function of the current state and money stock, but not of the 
current date, which presumes the stochastic stationary rational expectations equilib¬ 
rium that is to be defined below. 
    $M_{t+1} + Q(s_t, \bar M_t) z_{t+1} \le [M_t - P(s_t, \bar M_t) c_t]$ 
        $+ [Q(s_t, \bar M_t) + P(s_t, \bar M_t) y_t] z_t + (\omega_t - 1)\bar M_t$.    (4) 

Here $M_{t+1}$ and $z_{t+1}$ are new holdings of money and shares, to be 
carried into period $t + 1$; $Q(s_t, \bar M_t)$ is the share price in terms of 
money. The first term on the right-hand side is cash not spent on 
consumption, and the second term is the gross return on shares. 

I now introduce the (real) price of money (i.e., its purchasing power 
in terms of goods), $\pi_t = 1/P_t$, and the real price of shares, $q_t = Q_t/P_t$. 
For simplicity, primed variables denote variables at time $t + 1$ at state 
$s_{t+1}$ and money stock $\bar M_{t+1}$, whereas nonprimed variables denote variables 
in period $t$ at state $s_t$ and money stock $\bar M_t$. Then the budget 
constraint (4) and the liquidity constraint (3) can be written as 

    $c + \pi M' + q z' \le \pi M + (q + y) z + \pi(\omega - 1)\bar M \equiv w$    (5a) 

and 

    $c \le \pi M$.    (5b) 

Here the right-hand side of (5a) is identified with total (real) wealth in 
period $t$. Wealth in period $t + 1$ will hence be given by 7 

    $w' = \pi' M' + (q' + y') z' + \pi'(\omega' - 1)\bar M'$.    (5c) 

The consumer will maximize (2) subject to (3) and (4). In an equilibrium 
we will have 

    $c_t = y_t$, $M_t = \bar M_t$, $M_{t+1} = \bar M_{t+1} = \omega_t \bar M_t$, and $z_t = 1$.    (6) 

That is, the goods, money, and share markets clear at each date. 

We assume the existence of a unique stochastic stationary rational 
expectations equilibrium. 8 In such an equilibrium, the probability dis¬ 
tributions of the exogenous variables are independent of t. Then the 
solution to maximizing (2) subject to (3) and (4) gives the value function 
$v(w, M, s, \bar M)$ defined implicitly as the maximum of 

    $u(c) + \beta \int v(w', M', s', \bar M')\, dF(s'; s)$    (7) 


7 Note that here the variables $M'$ and $z'$ are choice variables at period $t$ and hence, in 
equilibrium, not functions of $s'$ and $\bar M'$ but of $s$ and $\bar M$. 

8 The methods to be used in proving the existence of such an equilibrium are demonstrated 
in Lucas (1978, 1982), where existence is proved for his barter and monetary 
models. Townsend (1982) proves the existence of equilibrium in his monetary asset-pricing 
model. One particular problem that arises in my model is that for arbitrary $M_t$, 
$\omega_t$, and $\bar M_t$ the budget set given by (4) may be empty for some states. Such situations can 
arise if $M_t$ is small and $(\omega_t - 1)\bar M_t$ is large and negative. This can be resolved as follows: 
let there be an $\varepsilon > 0$ such that $\omega > \varepsilon$ with probability one. That is, $\omega$ is never below a 
given positive $\varepsilon$. Then add the constraint $M_{t+1} \ge (1 - \varepsilon)\bar M_{t+1}$ to (4), which ensures that 
the budget set will be nonempty for all $t$. In equilibrium, $M_t = \bar M_t$ for all $t$ and the added 
constraint is slack. 
over $c$, $M'$, and $z'$, subject to (5). We let $\lambda$ and $\mu$ be the Lagrange 
multipliers associated with the constraints (5a) and (5b), respectively. 
We have, by standard properties, that the partials of the value function 
fulfill 

    $v_w = \lambda$, $v_M = \mu\pi$.    (8) 

We combine (6) and (8) with the first-order conditions for maximizing 
(7) subject to (5). It is practical to introduce the notation $m_t = \pi_t \bar M_t$ for 
real balances in (the beginning of) period $t$. Then we can, after some 
manipulations, finally write the equations that define an equilibrium 
in period $t$, for each $s = (y, \omega)$, as 

    $y \le m(s)$  $[\mu(s) \ge 0]$,    (9a) 

    $\lambda(s) + \mu(s) = u_c(y)$,    (9b) 

    $\lambda(s) m(s) = \beta E[u_c(y') m(s'); s] / \omega$,    (9c) 

and 

    $\lambda(s) q(s) = \beta E[\lambda(s')(q(s') + y'); s]$.    (9d) 

Here the variables $m$, $\lambda$, $\mu$, and $q$ can be solved as functions of $s$ only, 
and not of $\bar M$; 9 $E[f(s'); s]$ denotes $\int f(s')\, dF(s'; s)$ (all expectations 
throughout this section are conditional on $s$). The price of money will 
be a function of both $s$ and $\bar M$ and given by 

    $\pi(s, \bar M) = m(s)/\bar M$.    (9e) 

Equation (9a), the liquidity constraint, says that real balances $m$ 
equal or exceed output $y$. By (8) the multiplier $\mu$ is the marginal utility 
of real balances ($\mu\pi$ is the marginal utility of [nominal] money). The 
expression $[\mu \ge 0]$ denotes the usual complementary slackness condition; 
if the liquidity constraint is not binding, the marginal utility of 
real balances is zero; and if the marginal utility of real balances is 
positive, the liquidity constraint is binding. By (8) $\lambda$ is the marginal 
utility of wealth, and by (9b) the sum of marginal utility of wealth and 
of real balances equals the marginal utility of consumption. Put differently, 
the existence of a binding liquidity constraint drives a wedge 
between the marginal utility of wealth and the marginal utility of 
consumption, since wealth cannot instantaneously buy consumption. 

Also note that $\mu$ is the liquidity component of only the marginal 
utility of real balances, what can be called the liquidity services of real 
balances. Real balances are also part of wealth, and the total marginal 
utility of real balances is $\lambda + \mu$. 

9 Intuitively this follows since $\bar M$ does not enter as a separate exogenous variable in 
the equations. 

Expression (9d) is the standard capital-asset-pricing equation for 
the value of a claim to dividends from sales of output. The price of 
the claim is equal to the discounted expected next period’s marginal 
utility of wealth times the gross return on the claim, divided by the 
present marginal utility of wealth. The gross return is the sum of next 
period's value of the claim $q'$, the “indirect” return, and next period’s 
“direct” return $y'$. Equation (9d) can be solved forward in the usual 
way to give 

    $q_t = \sum_{\tau=t+1}^{\infty} \beta^{\tau-t} E_t(\lambda_\tau y_\tau)/\lambda_t$.    (10) 


That is, the price of the claim equals the discounted sum of all future 
periods’ expected marginal utility of its dividends (conditional on 
current information), divided by the present marginal utility of wealth. 

Equations (9d) and (10) are completely standard capital-asset- 
pricing equations, except in one way. The marginal utility of wealth, 
$\lambda$, is here not always equal to the marginal utility of consumption. As 
we have noted the existence of a binding liquidity constraint drives a 
wedge between the marginal utility of wealth and that of consump¬ 
tion, and the marginal utility of wealth is then less than the marginal 
utility of consumption, the difference being the marginal utility of 
real balances. Clearly the existence of the liquidity constraint will 
affect the pricing of the claim. 

It is of interest here that equation (9c) can also be interpreted as a 
capital-asset-pricing equation: by using $m = \pi\bar M$, $m' = \pi'\bar M' = \pi'\omega\bar M$, 
and (9b), we can write equation (9c) as 

    $\lambda\pi = \beta E[(\lambda' + \mu')\pi']$.    (9c') 
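
The step from (9c) to (9c') can be written out explicitly. The following is only a sketch of the substitution described in the text; $\bar M$ denotes the aggregate money stock and $\bar M' = \omega\bar M$ its next-period value.

```latex
\begin{align*}
\lambda m &= \frac{\beta}{\omega}\, E\left[u_c(y')\, m'\right]
  && \text{(9c)} \\
\lambda \pi \bar M &= \frac{\beta}{\omega}\, E\left[(\lambda' + \mu')\, \pi' \omega \bar M\right]
  && \text{using } m = \pi \bar M,\ m' = \pi' \omega \bar M,\ \text{and (9b)} \\
\lambda \pi &= \beta\, E\left[(\lambda' + \mu')\, \pi'\right]
  && \text{(9c')}
\end{align*}
```

The factors $\omega$ and $\bar M$ are known at the time the expectation is taken, which is why they can be moved outside the expectation and cancelled.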


Equation (9c') can of course also be solved forward to give 

    $\pi_t = \sum_{\tau=t+1}^{\infty} \beta^{\tau-t} E_t(\mu_\tau \pi_\tau)/\lambda_t$.    (11) 

That is, the real price of a unit of money is the discounted sum of all 
future periods' expected marginal utility of money, divided by the 
present marginal utility of wealth. Hence, money is priced in com¬ 
plete symmetry with other assets, once its direct return has been ap¬ 
propriately defined. The direct return to money in this case is simply 
the value of the liquidity services provided by money, measured by 
the value of relaxing the liquidity constraint, which value is given by 
$v_M = \mu\pi$. 10 
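
The forward solution of the pricing equation for money can be sketched by repeated substitution. The derivation below uses the law of iterated expectations and assumes, as an extra condition not stated in the text, that the remainder term $\beta^{T-t} E_t[\lambda_T \pi_T]$ vanishes as $T \to \infty$.

```latex
\begin{align*}
\lambda_t \pi_t
  &= \beta E_t\left[\mu_{t+1}\pi_{t+1}\right] + \beta E_t\left[\lambda_{t+1}\pi_{t+1}\right] \\
  &= \beta E_t\left[\mu_{t+1}\pi_{t+1}\right] + \beta^2 E_t\left[\mu_{t+2}\pi_{t+2}\right]
     + \beta^2 E_t\left[\lambda_{t+2}\pi_{t+2}\right] \\
  &= \sum_{\tau=t+1}^{T} \beta^{\tau-t} E_t\left[\mu_\tau \pi_\tau\right]
     + \beta^{T-t} E_t\left[\lambda_T \pi_T\right].
\end{align*}
```

Letting $T \to \infty$ under the stated assumption and dividing by $\lambda_t$ gives the discounted sum of expected liquidity services. The same steps applied to the share-pricing equation, with $\lambda_\tau y_\tau$ in place of $\mu_\tau \pi_\tau$, give the price of the claim to dividends.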


10 We should emphasize that the pricing eqq. (9c)-(11) do not require the existence 
of a stochastic stationary state. Then the variables in each period $t$ also depend on $t$. 
In the present framework, as in Lucas (1982), the price of any 
arbitrary asset can be determined only if its direct return as a function 
of the state is specified. Let us compute nominal and real rates this 
way. 

Let the nominal interest rate, $i$, be defined from the nominal present 
value of a nominal bond that pays one sure unit of money in the 
next period. More precisely, the bond is bought on the asset markets 
at the end of period $t$, and it pays one unit of money at the end of 
period $t + 1$. The real present value at the end of period $t$ of this 
bond, that is, the nominal present value deflated by the price level, is 
$\beta E(\lambda'\pi')/\lambda$. The present value measured in money is $\beta E(\lambda'\pi')/\lambda\pi$. 
Hence, the nominal rate of interest is given by $1/(1 + i) = \beta E(\lambda'\pi')/\lambda\pi$. 
Using (9c) and rearranging, we have 

    $i = E(\mu'\pi')/E(\lambda'\pi')$.    (12) 


Hence, the nominal interest rate is the expected utility of the liquidity 
services of money over the expected utility of wealth of money. 11 
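
The rearrangement behind the expression for the nominal interest rate can be sketched in one line. It combines the bond-pricing condition $1/(1+i) = \beta E(\lambda'\pi')/\lambda\pi$ with the pricing equation for money in the form $\lambda\pi = \beta E[(\lambda'+\mu')\pi']$.

```latex
\begin{align*}
1 + i
  = \frac{\lambda\pi}{\beta E(\lambda'\pi')}
  = \frac{\beta E\left[(\lambda'+\mu')\pi'\right]}{\beta E(\lambda'\pi')}
  = 1 + \frac{E(\mu'\pi')}{E(\lambda'\pi')}.
\end{align*}
```

Since $\mu' \ge 0$ in every state, the ratio on the right cannot be negative.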

We see from (12) that the nominal interest rate cannot be negative, 
and it is strictly positive when the expected next period's marginal 
utility of liquidity services is positive, that is, when the liquidity con¬ 
straint is binding in at least some state with positive probability. 

The real interest rate, $\rho$, is defined from the real present value of an 
indexed bond that can be bought at the end of period $t$ and that pays 
one sure unit of real wealth at the end of period $t + 1$. It is given by 
$1/(1 + \rho) = \beta E(\lambda')/\lambda$; hence 

    $\rho = \frac{\lambda}{\beta E(\lambda')} - 1$.    (13) 


The real interest rate thus defined is the present marginal utility of 
wealth over the discounted expected marginal utility of the next pe¬ 
riod’s wealth, minus one. It is related to the usual expected intertem¬ 
poral marginal rate of substitution, although not between consump¬ 
tion in the current and next period, but between wealth in the current 
and next period. 12 


11 Put differently, the nominal interest rate is the ratio between the expected marginal 
utility of the direct return on money and the expected utility of the indirect return 
on money. We can understand this relation as follows: money and bonds have the same 
indirect return. For one unit of money held as cash at the end of period $t$ and one unit 
of money invested in a nominal bond at the end of period $t$, both “pay” one unit of 
money at the end of period $t + 1$ as indirect return. The expected utility of that indirect 
return is $\beta E(\lambda'\pi')$. In equilibrium, the expected utility of the liquidity services of one 
unit of money must equal the expected utility of the interest on one unit of money 
invested as a nominal bond. This gives rise to (12). 

12 Note that by assumption this indexed bond does not pay out physical units of 
Other definitions of interest rates and a comparison with Lucas 
(1982) are contained in Section V. 


III. Real Balances, the Price Level, and Interest 
Rates 

In this section I shall examine in some detail how the price level and 
nominal and real interest rates are determined as functions of the rate 
of expansion of money supply and of output/income (I shall loosely 
refer to $y$ as both output and income). 

I shall solve the equations (9a)-(9c) and (9e) and find the functions 
$m(s)$, $\lambda(s)$, $\mu(s)$, and $\pi(s, \bar M)$. (Note in passing that this can be done 
independently of eq. [9d].) These functions by (12) and (13) give the 
nominal and real interest rates $i(s)$ and $\rho(s)$. The price of money will 
depend on both the state and the stock of money, whereas the other 
variables will depend on the state only. The pricing function (9e) for 
money can be interpreted as a demand-price function for money. 
Hence, we are indeed about to derive a proper demand function for 
money. The equilibrium real price of money, and hence the nominal 
price level $P = 1/\pi$, are then determined by the interaction between 
the demand for and supply of money, in the usual way. 13 

I will examine in some detail the demand for money as a function 
of temporary disturbances in income $y$ and (the gross rate of) monetary 
expansion $\omega$. I do this by assuming that the probability distribution of 
$s_{t+1} = (y_{t+1}, \omega_{t+1})$ is independent of the realization of $s_t$; that is, $F(s'; s)$ 
$\equiv F(s')$. Then, for a given probability distribution, I can interpret the 
current realization of the state as a temporary disturbance, in the 
sense that it does not change the probability distribution of future 
states. A permanent disturbance, on the other hand, will affect future 
probability distributions, as further specified below. I will report some 
results for the case of permanent disturbances, but not go into detail. 


the consumption good at the end of the next period. If it did, the real interest would be 
given by $1/(1 + \rho) = \beta E[u_c(y')]/\lambda$, where the marginal utility of consumption $u_c(y')$ 
replaces the marginal utility of wealth. The behavior of that interest rate can of course 
also be examined. However, since we are considering a monetary economy, we restrict 
the analysis to an indexed bond that pays cash to an amount equivalent to one unit of 
real wealth, that is, paying $1/\pi' = P'$ units of money, at the end of the next period. 

13 Note that $m_t = \pi_t \bar M_t$ is beginning-of-period real balances, the money stock in the 
beginning of the period deflated by the goods price. We can also consider end-of-period 
real balances, defined as $\hat m_t = \pi_t \bar M_{t+1}$, say. Of course, the two definitions of real balances 
are related by $\hat m_t = \omega_t m_t$. The purpose of choosing a concept of real balances is to 
help understand the determinants of the demand for money and the price level. Which 
concept of real balances we choose to work with is purely a matter of convenience. We 
choose beginning-of-period real balances, since they have the advantage of varying 
directly with the real price of money, that is, inversely with the price level (recall that $\bar M$, 
the beginning-of-period money stock, is predetermined). 
If states are independent, it follows that the term $E[u_c(y') m(s'); s]$ on 
the right-hand side of (9c), which is expected total marginal utility of 
real balances, is a constant independent of $s$. 

To characterize the solution, note that there will generally be two 
regions in $(y, \omega)$-space, one in which the liquidity constraint is not 
binding ($m > y$ and $\mu = 0$) and one in which it is binding ($m = y$ and 
$\mu > 0$). The border line between the two regions is given by the set of 
$(y, \omega)$ fulfilling $\mu(s) = 0$ and $m(s) = y$, which by (9b) and (9c) gives 

    $\omega = \beta A/(u_c(y) y) \equiv \bar\omega(y)$,    (14) 

where $A = E[u_c(y') m(s')]$. When $(y, \omega)$ fulfill $\omega < \bar\omega(y)$ the liquidity 
constraint will not bind, whereas it will be binding for $\omega \ge \bar\omega(y)$. 14 
Intuitively, we can understand this by noting that in the beginning of 
the current period a high current realization of monetary expansion 
means that more money will be distributed at the end of the period 
and that the future value of money $\pi' = m'/\bar M'$ will be lower in all 
future states since $\bar M' = \omega\bar M$ is higher. Therefore it will be less attractive 
to hold money and more attractive to spend money on consumption, 
which will bid up the money price of consumption goods in the 
beginning of the current period. This lowers the current value of 
money and current real balances. Hence, for sufficiently large monetary 
expansion real balances will fall to hit the liquidity constraint. 

The explicit solution in the region below the border line, when 
$\omega < \bar\omega(y)$, is 

    $m(s) = \beta A/(\omega u_c(y)) > y$, $\lambda(s) = u_c(y)$, $\mu(s) = 0$;    (15a) 

the solution above the border line, when $\omega \ge \bar\omega(y)$, is 

    $m(s) = y$, $\lambda(s) = \beta A/(y\omega)$, $\mu(s) = u_c(y) - \beta A/(y\omega) \ge 0$.    (15b) 

For small $\omega$, when $m > y$, real balances by (15a) fall with increasing $\omega$, 
to hit the liquidity constraint eventually. Then real balances cannot 
fall further, and instead the marginal utility of real balances $\mu$ increases 
with $\omega$ by (15b). 
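
To make the two-region solution concrete, the following is a minimal Python sketch of (15a) and (15b). It assumes constant relative risk aversion, so that the marginal utility of consumption is `y**(-r)`, and it treats the constant A (the expected total marginal utility of real balances) as a given number rather than solving for it. The function name `solve_state` and the parameter values in the usage lines are illustrative choices, not part of the paper.

```python
def solve_state(y, omega, A, beta, r):
    """Real balances m, marginal utility of wealth lam, and marginal
    utility of real balances mu in state (y, omega), following (15a)-(15b).

    Assumes CRRA marginal utility u_c(y) = y**(-r) and takes the
    constant A = E[u_c(y') m(s')] as given (an assumption of this sketch).
    """
    u_c = y ** (-r)
    omega_bar = beta * A / (u_c * y)      # border line (14)
    if omega < omega_bar:                 # liquidity constraint not binding
        m = beta * A / (omega * u_c)
        lam = u_c
        mu = 0.0
    else:                                 # liquidity constraint binding
        m = y
        lam = beta * A / (y * omega)
        mu = u_c - lam
    return m, lam, mu


# Illustrative parameter values (hypothetical).
slack = solve_state(y=1.0, omega=0.5, A=1.0, beta=0.95, r=2.0)
binding = solve_state(y=1.0, omega=1.9, A=1.0, beta=0.95, r=2.0)
```

In both regions the returned values satisfy `lam + mu == u_c` and `lam * m == beta * A / omega` up to rounding, which are the conditions (9b) and (9c) with independent states.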

The elasticity of the borderline function $\bar\omega(y)$ with respect to $y$ is $r(y) - 1$, 
where $r(y) = -y u_{cc}/u_c$ is the Arrow-Pratt measure of relative 
(consumption) risk aversion. 15 Hence the slope of the border line 
between the regions depends on whether the relative risk aversion is 
greater or smaller than unity. 


14 This can be shown as follows: Suppose $\omega = \bar\omega(y)$, where $m = y$, $\mu = 0$, and $\lambda = u_c$. 
(i) Let $\omega$ increase for constant $y$. Suppose $\mu$ remains equal to zero and hence $\lambda$ remains 
constant equal to $u_c$. Then, by (9c), $m$ must fall. But then $m < y$, which is impossible. 
Hence $\mu$ must increase and be positive for $\omega > \bar\omega(y)$. (ii) Instead let $\omega$ decrease below 
$\bar\omega(y)$, for constant $y$. Assume $m = y$. By (9c) $\lambda$ must rise. But by $\mu \ge 0$, $\lambda \le u_c(y)$, which is 
a contradiction. Hence $m > y$ for $\omega < \bar\omega(y)$. 

The solution is illustrated in figure 1 for the case with constant 
relative risk aversion. 16 The border line, the marginal utility of real 
balances $\mu$, and the income velocity of money $y/m$ are plotted. Panels 
a-c in figure 1 correspond to the three cases $0 < r < 1$, $r = 1$, and $r > 1$, 
respectively. The border line has a negative, zero, and positive 
slope in the three cases, respectively. (Panel c is drawn for $r = 2$. If $r < 2$, 
the border line is concave downward; if $r > 2$ it is convex downward.) 
Below the border line, velocity is rising in monetary expansion 
up to its maximum, unity, and above the border line velocity equals 
unity. The marginal utility of real balances is zero below the border 
line, and it is rising in monetary expansion above the border line. The 
isovalue curves for monetary velocity and marginal utility of real balances 
are also plotted. 17 

We see in figure 1 that below the border line velocity is increasing 
or decreasing in income depending on whether the degree of risk 
aversion is less than or larger than unity, respectively. To understand 
this result, recall from (15a) that when the liquidity constraint is not 
binding, marginal utility of wealth equals the marginal utility of consumption. 
Assume that the degree of risk aversion is less than unity. 
This means that marginal utility of consumption and of wealth decreases 
less than proportionately to an increase in income. By (9c) real 
balances vary inversely with the marginal utility of wealth. Hence, real 
balances increase less than proportionately to income, and indeed 
velocity increases. Similarly, with a degree of risk aversion larger than 
unity, marginal utility of wealth and real balances vary more than 
proportionately to income, and velocity falls with income. 18 
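
The dependence of velocity on income can be checked numerically. The sketch below uses the constant-relative-risk-aversion expressions for the income velocity y/m; the constant A is taken as given, and the function name `velocity` and all numbers are illustrative assumptions rather than values from the paper.

```python
def velocity(y, omega, A, beta, r):
    """Income velocity y/m with CRRA utility u_c(y) = y**(-r).

    Below the border line omega < beta*A*y**(r-1) velocity is
    omega*y**(1-r)/(beta*A); on or above it velocity equals one.
    A = E[u_c(y') m(s')] is treated as a given constant (assumption).
    """
    border = beta * A * y ** (r - 1.0)
    if omega < border:
        return omega * y ** (1.0 - r) / (beta * A)
    return 1.0


# Hypothetical values, chosen so that both incomes lie below the border line.
for r in (0.5, 1.0, 2.0):
    low = velocity(y=1.0, omega=0.2, A=1.0, beta=0.95, r=r)
    high = velocity(y=2.0, omega=0.2, A=1.0, beta=0.95, r=r)
    print(r, low, high)
```

With these inputs velocity is higher at the higher income when r is below one, unchanged when r equals one, and lower when r is above one, which is the pattern described in the paragraph above.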


15 As is well known, with an additive intertemporal utility function (2) the degree of 
risk aversion is inversely related to the degree of intertemporal substitution in consumption. 

16 With constant relative risk aversion we have $u(c) = c^{1-r}/(1 - r)$ for $r \ne 1$ and $u(c) = 
\log c$ for $r = 1$. Then the equation for the border is $\omega = \bar\omega(y) \equiv \beta A y^{r-1}$. The solution is 
then, for $\omega < \bar\omega(y)$, $m(s) = \beta A y^r/\omega > y$, $\lambda(s) = y^{-r}$, and $\mu(s) = 0$; and for $\omega \ge \bar\omega(y)$, $m(s) = 
y$, $\lambda(s) = \beta A/(y\omega)$, and $\mu(s) = y^{-r} - \beta A/(y\omega) \ge 0$. The income velocity of money is $y/m = 
\omega y^{1-r}/(\beta A)$ for $\omega < \beta A y^{r-1}$ and $y/m = 1$ for $\omega \ge \beta A y^{r-1}$. 

17 The isovalue curves for monetary velocity fulfill the equation $\omega = \beta A y^{r-1}(y/m)$ for 
constant $y/m$, and they are of the same form as the equation for the border line. The 
isovalue curves for marginal utility of real balances fulfill the equation $\omega = \beta A y^{r-1}/(1 - 
\mu y^r)$ for constant $\mu$. 

18 Let us again emphasize that the monetary velocity $y/m$ is defined using beginning-of-period 
real balances. We can also look at end-of-period monetary velocity $y/\hat m = 
y/(\omega m)$. As mentioned, the advantage of using beginning-of-period real balances is the 
direct relation to the price level as given by (15). The advantage of defining monetary 
velocity as $y/m$ is the ease with which the solution can be illustrated and interpreted. 
The price level $P(s, \bar M) = 1/\pi(s, \bar M) = \bar M/m(s)$ varies inversely with 
real balances $m(s)$. Past inflation $P/P_{-1}$ of course varies with the price 
level $P$. Future inflation fulfills 

    $P'/P = \pi/\pi' = \omega m(s)/m(s')$,    (16) 

where $\pi' = m'/\bar M' = m'/\omega\bar M$. Hence future inflation varies, for given 
next-period state $s'$, directly with end-of-period real balances $\hat m = \omega m$. 
The same is the case with expected inflation, 

    $E(P'/P) = E(\pi/\pi') = \omega m(s) E(1/m(s'))$,    (17) 



since the term $E(1/m')$ is constant. We see from (15a) that when the 
liquidity constraint is not binding, expected inflation varies with 
$1/u_c(y)$ and hence is independent of monetary expansion and is increasing 
in income. When the liquidity constraint is binding, it is 
increasing in both monetary expansion and income. That expected 
inflation is increasing in income is natural: a temporary increase in 
income temporarily lowers the current price level, which increases 
inflation. That a temporary increase in monetary expansion increases 
expected inflation when the liquidity constraint binds also seems intu¬ 
itive: next period’s price level increases across the board, whereas 
current real balances and the current price level are independent of 
current monetary expansion. Why is expected inflation independent 
of monetary expansion when the liquidity constraint is not binding? 
The reason is that current real balances fall, and hence the current 
price level rises, in proportion to monetary expansion. Hence the 
current price level rises in proportion to next period’s price level, and 
expected inflation remains unaffected. 

Let us also see how nominal and real interest rates vary with temporary 
disturbances in monetary expansion and income. By substituting 
$\pi' = m'/\bar M' = m'/\omega\bar M$ in (12) and simplifying, we get that the nominal 
interest rate fulfills 

    $i(s) = E(\mu' m')/E(\lambda' m')$,    (18) 


which is constant and independent of current monetary expansion 
and income. That the nominal interest rate is independent of, rather 
than increasing in, a temporary monetary disturbance may appear 
somewhat surprising. How can we understand this result? One way is 
to recall from the discussion of (12) that the nominal interest rate 
compensates for the absence of liquidity services of nominal bonds. 
The relation between next period's liquidity services ($\mu'$) and next
period's marginal utility of wealth ($\lambda'$) depends on next period's state
only, and not on the current monetary expansion. A higher current
monetary expansion lowers the value of money $\pi'$ in all states, but it
does not affect the relative attractiveness of nominal bonds and 
money. 

What about the real interest rate, given by $\rho = \lambda/\beta E(\lambda') - 1$ in (13)?
The term $E(\lambda')$, the expected marginal utility of wealth, is independent
of current monetary expansion and income, and the real interest
rate varies with current marginal utility of wealth. By (15a) it follows 
that below the border line it varies with the marginal utility of con¬ 
sumption and hence is independent of monetary expansion and de¬ 
creasing in income. Above the border line, that is, when the liquidity 
constraint is binding, it is decreasing in both monetary expansion and 
income. 

Summing up, for a temporary increase in monetary expansion, real 
balances fall and the price level rises when the liquidity constraint is 
not binding, expected inflation rises when the liquidity constraint 
binds, the nominal interest rate remains constant, and the real inter¬ 
est rate falls when the liquidity constraint binds. This means that a 
temporary disturbance in monetary expansion has a direct effect on 
real balances and the price level, an effect that is not captured by any 
change in the nominal interest rate. 19 Note that a proper demand 
function for money may need current monetary expansion as an 
independent argument, separately from current income and the 
nominal interest rate. 

Permanent disturbances in monetary expansion and income can be 
modeled by introducing positive serial correlation. This is done in 
detail in Svensson (1983). Here I report only some of the results.

The dependence of monetary velocity on permanent disturbances 
in monetary expansion and income can be illustrated as in figure 1,
except that the border line and the isovalue curves for monetary 
velocity and marginal utility of real balances are flatter. In particular, 
below the border line (when the liquidity constraint is not binding), 
and relative to temporary disturbances, monetary velocity is more sen¬ 
sitive to permanent disturbances in monetary expansion and less sen¬ 
sitive to permanent disturbances in income. This is intuitive: a perma¬ 
nent increase in monetary expansion will increase next period’s price 
level more than a temporary increase in monetary expansion. This 
means that money becomes less attractive to hold, which in equilib¬ 
rium will bid up the current price level more, and hence depress 
current real balances further, making these more sensitive to perma¬ 
nent disturbances in monetary expansion. With regard to income 

19 Note that this is true also for the demand for end-of-period real balances, although
in opposite regions in $(y, \omega)$-space: for $m > y$, $\hat{m} = \beta A/u_c(y)$, whereas for $m = y$, $\hat{m} = \omega y$.
disturbances, the flatter isovalue curves for monetary velocity when
disturbances are permanent imply that monetary velocity is less sensitive
to permanent income disturbances than to temporary disturbances.
That is, the elasticity of real balances with respect to income is
closer to unity with permanent disturbances. A current temporary
increase in income decreases the marginal utility of wealth ($\lambda$), requiring
real balances to increase. A permanent increase in income in
addition decreases next period's marginal utility of wealth, which depresses
the expected total marginal utility of next period's real balances.
The net effect is to stabilize the income velocity of money.

Thus, the price level and past inflation are more sensitive to permanent
monetary expansion than temporary. They are closer to moving
proportionally with permanent income changes than with temporary
ones.


The nominal interest rate is increasing in permanent monetary
expansion. Intuitively, recalling (12), a permanent increase in monetary
expansion will decrease next period's expected marginal utility of
wealth and hence increase next period's expected marginal utility of
real balances, making money relatively more attractive than bonds.
To compensate for that, the nominal interest rate must rise. The real
interest rate is increasing in permanent monetary expansion below
the border line, whereas it seems that it can be either decreasing or
increasing above the border line. A permanent increase in monetary
expansion, in contrast to a temporary increase, decreases next
period's expected marginal utility of wealth [...].


IV. The Fisher Relation and the Premium on 
Nominal Bonds 

As an application of previous results, I shall examine some frequently
discussed parity conditions (see, e.g., Roll and Solnik 1979; Kouri
1983). Let us first look at the Fisher relation, between nominal
and real interest rates and the expected rate of inflation. By the
definitions of nominal and real interest rates given in (12) and (13) we
can write the relation between nominal and real interest rates as (all
expectations in this section are conditional on the current state)
$(1 + i)/(1 + \rho) = E(\lambda')/E(\lambda'\pi'/\pi)$. The right-hand side is one over the
probability and marginal-utility-of-wealth-weighted (gross) real rate
of appreciation of money. According to the simple Fisher relation it
should equal the expected (gross) rate of inflation, $E(P'/P) = E(\pi/\pi')$.
[...] the simple Fisher relation does not hold
under uncertainty. The relation above can be rewritten as

$$\frac{1 + i}{1 + \rho} = \frac{1}{E(\pi'/\pi)} - \frac{\mathrm{cov}(\lambda', \pi'/\pi)}{E(\pi'/\pi)\,E(\lambda'\pi'/\pi)}. \tag{19}$$

Hence we see that the sign of $\mathrm{cov}(\lambda', \pi'/\pi)$, the covariance between the
marginal utility of wealth and the rate of appreciation of money, is
crucial to whether the left-hand side of (19) exceeds or falls short of
$1/E(\pi'/\pi)$, one over the expected real rate of appreciation of money. 20

Another frequently discussed parity relation is that between the 
expected real return on nominal bonds and the real rate of interest on 
indexed bonds. More precisely, the issue is whether there is a risk 
premium on nominal bonds relative to indexed bonds. Let us define 
the expected real return on nominal bonds as 21 

$$R = (1 + i)\,E\left(\frac{\pi'}{\pi}\right) - 1. \tag{20}$$

We say that there is a risk premium on nominal bonds if the ex¬ 
pected real rate of return on nominal bonds exceeds the real interest 
rate on an indexed bond. Using (20) and (13) we can write 22

$$R - \rho = -(1 + \rho)\,\frac{\mathrm{cov}(\lambda', \pi'/\pi)}{E(\lambda'\pi'/\pi)}. \tag{21}$$


20 We note that real wealth risk neutrality, meaning that the marginal utility of wealth $\lambda'$
is independent of the state $s'$, would imply that the covariance between the marginal
utility of wealth and the real rate of appreciation of money is zero, and hence that the
right-hand side of (19) equals one over the expected real rate of appreciation of money.
We also note in passing that the distinction between marginal utility of wealth and
marginal utility of consumption makes it necessary to distinguish wealth risk behavior
from consumption risk behavior. Put differently, the concavity of the value function in
wealth is relevant, in addition to the concavity of the direct utility function in consumption.
To express the Fisher relation explicitly in terms of the rate of inflation, we can
rewrite the right-hand side of (19) to get $(1 + i)/(1 + \rho) = E(P'/P) + \mathrm{cov}(\lambda'\pi', P'/P)/E(\lambda'\pi')$.
Hence, whether $(1 + i)/(1 + \rho)$ exceeds or falls short of the expected rate
of inflation depends on $\mathrm{cov}(\lambda'\pi', P'/P)$, the covariance between the marginal utility of
nominal wealth ($\lambda'\pi'$) and the rate of inflation. Here we note that nominal wealth risk
neutrality, meaning that the marginal utility of nominal wealth $\lambda'\pi'$ is independent of
the state $s'$, would imply that the right-hand side is indeed equal to the expected rate of
inflation.

21 We can understand (20) as follows: One unit of money invested in nominal bonds
at the end of period $t$ pays $1 + i$ units of money at the end of period $t + 1$. The real
value of this is $(1 + i)\pi'$, ex post at $t + 1$. The real value of one unit of money at the end
of period $t$ is $\pi$. Hence, the real ex post rate of return is $[(1 + i)\pi'/\pi] - 1$. The
expected real rate of return is then given by (20).

22 We get $(1 + R)/(1 + \rho) = E\lambda' E\pi'/E(\lambda'\pi')$, from which $R - \rho = (1 + \rho)[E\lambda' E\pi' - E(\lambda'\pi')]/E(\lambda'\pi')$
$= -(1 + \rho)\,\mathrm{cov}(\lambda', \pi'/\pi)/E(\lambda'\pi'/\pi)$.
Hence, whether or not there is a risk premium on nominal bonds
depends on the sign of the covariance between the marginal utility of 
wealth and the real rate of appreciation of money, the same 
covariance that appears in the Fisher relation (19). Of course, the 
issue of a risk premium on nominal bonds is connected with the 
Fisher relation: if there is a negative correlation between marginal 
utility of wealth and the real appreciation of money, nominal bonds 
become a relatively less attractive asset for a risk-averse consumer 
than an indexed bond, which in equilibrium requires a higher nomi¬ 
nal rate of interest relative to the real rate of interest. 
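
The relations (19) through (21) can be checked numerically. The following is a minimal sketch, not part of the model's solution: the three states, their probabilities, and the values of $\lambda'$ and $\pi'/\pi$ are made-up numbers, and (12) is assumed to take the form $1/(1 + i) = \beta E(\lambda'\pi')/\lambda\pi$, which is what the ratio $(1 + i)/(1 + \rho) = E(\lambda')/E(\lambda'\pi'/\pi)$ in the text implies.

```python
import numpy as np

# Hypothetical three-state example; every number here is an illustrative assumption.
prob = np.array([0.3, 0.4, 0.3])          # probabilities of next-period states
lam_next = np.array([1.2, 1.0, 0.8])      # lambda': next-period marginal utility of wealth
g = np.array([0.90, 0.97, 1.02])          # pi'/pi: gross real appreciation of money
beta, lam = 0.95, 1.0                     # discount factor and current lambda


def E(x):
    """Expectation over the discrete states."""
    return float(np.sum(prob * x))


def cov(x, z):
    """Population covariance over the discrete states."""
    return E(x * z) - E(x) * E(z)


# Gross nominal and real interest rates (assumed forms of (12) and (13)).
one_plus_i = lam / (beta * E(lam_next * g))
one_plus_rho = lam / (beta * E(lam_next))

# Fisher relation: direct ratio versus the right-hand side of (19).
ratio = one_plus_i / one_plus_rho
rhs19 = 1.0 / E(g) - cov(lam_next, g) / (E(g) * E(lam_next * g))

# Expected real return on nominal bonds (20) and the premium (21).
R = one_plus_i * E(g) - 1.0
rho = one_plus_rho - 1.0
premium = -one_plus_rho * cov(lam_next, g) / E(lam_next * g)

print(ratio, rhs19)       # the two sides of (19)
print(R - rho, premium)   # the two sides of (21)
```

In this example the marginal utility of wealth is high in the state where money appreciates least, so the covariance is negative and the premium on nominal bonds is positive, in line with the discussion above.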

These relations have been discussed previously in the literature. 
What is new here is the following two circumstances. First, the mar¬ 
ginal utility of wealth is not identical to the marginal utility of con¬ 
sumption, the difference being the marginal utility of real balances. 

This marginal utility of real balances is not postulated by having real
balances in the utility function, but carefully derived as the endoge¬ 
nous shadow price of a liquidity constraint. Second, the variables, in 
particular the prices and interest rates, that enter these relations are 
the outcome of a general equilibrium, where only monetary supply 
and output are exogenous. Previous literature has taken at least some 
prices and interest rates as given, and is hence partial equilibrium, 
with the exception of Kouri (1983), which relies on the quantity equa¬ 
tion. 

Since the variables in the relations above are endogenous functions 
of the exogenous stochastic income and monetary expansion, we can 
go behind the relations and see how properties of income and monetary
expansion affect the Fisher relation and the premium on nominal
bonds. Let us limit the discussion to the case when income and
monetary expansion are serially independent, that is, when the solution
(15) applies. 23 Since $\pi'/\pi = m'/\omega m$, we have $\mathrm{cov}(\lambda', \pi'/\pi) =
\mathrm{cov}(\lambda', m')/\omega m$, and we can look at the covariance between marginal
utility of wealth and real balances. Here $\lambda' = \lambda(y', \omega')$ and $m' = m(y',
\omega')$ are both functions of income and monetary expansion. In general
the covariance between $\lambda'$ and $m'$ will depend on the first-order partials
of these functions together with the variances and covariances of
$y'$ and $\omega'$. It will also depend on the second-order partials and the
third- and fourth-order moments of $y'$ and $\omega'$. 24 Let us restrict the


23 Note then that marginal utility of real wealth, $\lambda(s)$, and marginal utility of nominal
wealth, $\lambda(s)\pi(s) = \lambda(s)m(s)/M$, are both nonconstant functions of the state $s$. Hence, what
we have called real and nominal risk neutrality does not occur here (see n. 20).

24 Let the vector $x$ be stochastic and let $f(x)$ and $g(x)$ be two real-valued functions.
Then $\mathrm{cov}[f(x), g(x)]$ is given by the formidable expression $f_x' E[(x - Ex)(x - Ex)']g_x +
f_x' E[(x - Ex)(x - Ex)' g_{xx}(x - Ex)]/2 + g_x' E[(x - Ex)(x - Ex)' f_{xx}(x - Ex)]/2 + E[(x
- Ex)' f_{xx}(x - Ex)(x - Ex)' g_{xx}(x - Ex)]/4$, where $f_x$ and $f_{xx}$ denote the gradient and
discussion to only first-order partials and second-order moments.
(This is equivalent to considering linear approximations of $\lambda[y', \omega']$
and $m[y', \omega']$.) Then, from (15) we see that $\lambda'$ is decreasing in $y'$ and $m'$
is increasing in $y'$, in both regions. Hence, the covariance between $\lambda'$
and $\pi'$ is decreasing in the variance of $y'$. In region $m' > y'$, $\lambda'$ is
independent of $\omega'$ and $m'$ is decreasing in $\omega'$. In region $m' = y'$, $\lambda'$ is
decreasing in $\omega'$ and $m'$ is independent of $\omega'$. It follows that $\mathrm{cov}(\lambda',
\pi'/\pi)$ may be either increasing or decreasing in the covariance between
$y'$ and $\omega'$, and hence the latter has an ambiguous influence. It
also follows that the variance of $\omega'$ has no effect. 25
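
The first-order argument rests on the bilinearity of the covariance: for linear approximations $\lambda' = E\lambda' + a(y' - Ey') + b(\omega' - E\omega')$ and $m' = Em' + c(y' - Ey') + d(\omega' - E\omega')$, the covariance is $ac\,\mathrm{var}(y') + (ad + bc)\,\mathrm{cov}(y', \omega') + bd\,\mathrm{var}(\omega')$. A minimal numerical sketch of this decomposition follows. The simulated shocks and the coefficient values are purely illustrative assumptions; only their signs (with $b = 0$ for the region $m' > y'$) are taken from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical correlated income (y) and monetary expansion (w) shocks.
y = rng.normal(1.0, 0.1, n)
w = 1.05 + 0.5 * (y - 1.0) + rng.normal(0.0, 0.05, n)

# Assumed slopes for the region m' > y':
# lambda' decreasing in y' (a < 0) and independent of omega' (b = 0);
# m' increasing in y' (c > 0) and decreasing in omega' (d < 0).
a, b, c, d = -0.8, 0.0, 0.6, -0.4

lam = 1.0 + a * (y - y.mean()) + b * (w - w.mean())
m = 1.0 + c * (y - y.mean()) + d * (w - w.mean())

S = np.cov(y, w)  # 2x2 sample covariance matrix of (y, w)
formula = a * c * S[0, 0] + (a * d + b * c) * S[0, 1] + b * d * S[1, 1]
direct = np.cov(lam, m)[0, 1]

print(formula, direct)  # agree up to floating-point error
```

Setting instead $b < 0$ and $d = 0$ gives the region $m' = y'$, where the coefficient on $\mathrm{cov}(y', \omega')$ changes sign, which is the ambiguity noted in the text.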

We conclude that a large variance in income contributes to the ratio
$(1 + i)/(1 + \rho)$ being larger than one over the expected real rate of
appreciation of money, $1/E(\pi'/\pi)$, and to a premium on nominal
bonds relative to indexed bonds. The covariance between income and
monetary expansion has an ambiguous effect. 26

Let us also look at the premium on bonds in Lucas’s (1982) mone¬ 
tary model. Lucas does not himself examine the parity condition, but 
it is not difficult to do that in his model. As shown in Section V, in
Lucas's model the relevant marginal utility of wealth equals the marginal
utility of consumption, $u_c(y)$. Also, the quantity equation holds,
$\pi M = y$. Finally, monetary transfers are distributed in the beginning
of each period and included in the beginning-of-period money supply,
$M' = \omega' M$. Hence, we have $\pi'/\pi = (y'/M')/(y/M) = y'/\omega' y$, and
the covariance crucial to the premium on bonds can be written
$\mathrm{cov}[u_c(y'), y'/\omega']/y$. Since $u_c$ is decreasing in $y'$, this covariance is decreasing
in the variance of $y'$. Since $y'/\omega'$ is decreasing in $\omega'$ and increasing
in $y'$, it follows that this covariance is decreasing in the covariance
between $y'$ and $\omega'$. Hence, a large variance of $y'$ as well as large
positive covariance between $y'$ and $\omega'$ contribute toward a premium
on nominal bonds. In Lucas's model the correlation between $y'$ and $\omega'$
does not have an ambiguous effect on the premium on nominal
bonds, in contrast to our case. This is due to the simplification caused


Hessian of $f(x)$ (evaluated at $Ex$), respectively, etc., where all nonprimed vectors are
column vectors, and a prime denotes transpose.

25 Consider the linear approximations $\lambda' = E\lambda' + a(y' - Ey') + b(\omega' - E\omega')$ and $m'
= Em' + c(y' - Ey') + d(\omega' - E\omega')$. Then it is easy to show that $\mathrm{cov}(\lambda', m') = ac\,\mathrm{var}(y')
+ (ad + bc)\,\mathrm{cov}(y', \omega') + bd\,\mathrm{var}(\omega')$.

26 With regard to the covariance between marginal utility of nominal wealth and the
rate of inflation, we have $\mathrm{cov}(\lambda'\pi', P'/P) = \beta A\,\mathrm{cov}(1/\omega', 1/m')$, by (15). Again we
disregard the effect of third- and fourth-order moments: $1/\omega'$ is obviously decreasing
in $\omega'$; $1/m'$ is increasing in $\omega'$ since $m'$ is decreasing, in the region $m' > y'$. Hence the
covariance between marginal utility of nominal wealth and inflation is decreasing in the
variance of $\omega'$; and $1/m'$ is decreasing in $y'$, since $m'$ is increasing. Thus, the covariance
between $\lambda'\pi'$ and $P'/P$ is increasing in the covariance between $y'$ and $\omega'$. Variance in $y'$
has no effect on it.
by an always binding liquidity constraint. Kouri (1983) derives the
same result as we have shown for Lucas's model.

V. The Timing of Information and Relations to 
Lucas (1982)** 

In Lucas (1982) the asset market is open after the state is known and 
before the consumption goods market opens. In my model, the asset
market is open before the state is known, and there are no possibilities 
to adjust money holdings after the state is known and before going to 
the goods market. Lucas (1982) restricts the discussion to equilibria 
where the liquidity constraint is binding, which occurs whenever 
nominal interest rates in his model are positive. In my model I also
consider equilibria when the liquidity constraint is not binding, which
in this model is consistent with positive interest rates. In this section I
shall discuss these issues and clarify the difference between Lucas’s 
case and mine. 

It may appear that with Lucas’s timing all equilibria have binding 
liquidity constraints (this is erroneously argued in Svensson [1988]). 
Indeed, the behavior of real balances and, in particular, whether or
not the liquidity constraint is binding turns out to be independent of
whether the asset market opens before or after the state is known.
Hence, equilibria with nonbinding liquidity constraints can occur in
Lucas's model, although they come with zero nominal interest rates,
given his definition of the latter. Since it may seem nonrestrictive to
assume positive nominal interest rates, it appears equally nonrestrictive
in Lucas's model to consider only binding liquidity constraints
and hence the quantity equation. 

To sort this out, let us assume that there exists in my model an asset 
market in the beginning of each period, after the period’s state is 
known, in addition to the already existing asset market at the end of 
the period. We assume that shares and money can be traded on this 
beginning-of-period asset market, before the goods market opens. As
before, we assume that dividends on shares and money transfers are
paid at the end-of-period asset market. (Below I shall also discuss
changing the timing of money transfers.) Let $\tilde{M}_t$ and $\tilde{z}_t$ denote the
quantity of money and shares held when the beginning-of-period

27 In a somewhat different model, Kouri (1983) derives the result that the premium
on bonds depends positively on the covariance between income and the price of money,
in our notation $\mathrm{cov}(y', \pi')$. This is the same result as what we have derived above for the
Lucas model. Since Kouri assumes the quantity equation, it follows also in his model
that increased variance in income and increased positive covariance between income
and monetary expansion contribute to a premium on nominal bonds.

28 This section owes much to suggestions by and discussions with Robert E. Lucas, Jr.
Salver (198*1) has independently arrived at similar results.
asset market closes and let $\tilde{Q}_t = \tilde{Q}(s_t, \bar{M}_t)$ be the nominal price of
shares on that asset market. Then the relevant budget and liquidity
constraints are

$$\tilde{M}_t + \tilde{Q}_t \tilde{z}_t \le M_t + \tilde{Q}_t z_t, \tag{22a}$$

$$P_t c_t \le \tilde{M}_t, \tag{22b}$$

and

$$M_{t+1} + Q_t z_{t+1} \le (\tilde{M}_t - P_t c_t) + (Q_t + P_t y_t)\tilde{z}_t + (\omega_t - 1)\bar{M}_t, \tag{22c}$$

where (22a) is the beginning-of-period and (22c) the end-of-period
budget constraint. The market equilibrium conditions are now

$$c_t = y_t, \quad \tilde{M}_t = M_t = \bar{M}_t, \quad \tilde{z}_t = z_t = 1. \tag{23}$$

These combined with the first-order conditions for maximizing (7) 
subject to (22) and (5c) give the equilibrium equations. These equilib¬ 
rium equations turn out to be (9a)—(9d) and 

$$\nu = u_c(y) \tag{24a}$$

and

$$\nu\tilde{q} = \lambda(q + y), \tag{24b}$$

where $\nu$ is the Lagrange multiplier corresponding to (22a) and $\tilde{q}$ is
$\tilde{Q}/P$. Hence the equilibrium with this added beginning-of-period asset
market is identical to the equilibrium without this asset market. The
added equations (24) say that $\nu$, the marginal utility of beginning-of-period
wealth, $w = \pi M + \tilde{q}z$, equals the marginal utility of consumption,
and that the beginning-of-period real share price $\tilde{q}$ equals the
end-of-period marginal utility of the total return on shares, $\lambda(q + y)$,
divided by the beginning-of-period marginal utility of wealth. In general
this beginning-of-period share price $\tilde{q}$ will of course differ from
the end-of-period share price $q$.

The real balance function m(s) and hence the price-level function 
$P(s, M)$ would look exactly the same if the end-of-period asset market
were deleted from the model—as long as money transfers are distrib¬ 
uted at the end of the period and cannot be used to buy consumption 
goods during the period. In Lucas’s (1982) model, money transfers 
are, however, distributed on the beginning-of-period asset market 
and included in the cash balances available for purchases on the goods
market. It can be shown that this indeed gives a different real-balance 
function. For instance, temporary monetary disturbances have no 
effects then on real interest rates. This makes sense: monetary distur¬ 
bances are now instantaneous helicopter drops, and serially indepen¬ 
dent shocks raise the price level in proportion to the money supply. 
When monetary transfers are distributed at the end of the period but
are known in the beginning, they are in a sense anticipated and then 
they have real effects. However, also with monetary transfer distrib¬ 
uted in the beginning of the period, the liquidity constraint may not 
be binding in some states. 

Let us finally consider the different definitions of nominal interest
rates. The nominal interest rate discussed in previous sections refers
to end-of-period nominal bonds, that is, one-period nominal bonds
that are bought on the end-of-period asset market in one period and
mature at the end-of-period asset market next period. We can also
consider beginning-of-period nominal bonds that are bought on the
beginning-of-period asset market and mature on the beginning-of-period
asset market next period. The corresponding nominal interest
rate $\tilde{i}$ is defined by

$$\frac{1}{1 + \tilde{i}} = \frac{\beta E(u_c'\pi')}{u_c \pi}. \tag{25}$$

Here, the beginning-of-period expected utility of a bond that pays
one unit of cash next beginning of period is $\beta E(\nu'\pi') = \beta E(u_c'\pi')$. Its
real present value is $\beta E(\nu'\pi')/\nu = \beta E(u_c'\pi')/u_c$, and its nominal present
value is hence $\beta E(u_c'\pi')/u_c\pi$, which equals $1/(1 + \tilde{i})$. Combining
(25), (9b), and (9c) gives

$$\tilde{i}(s) = \frac{\mu(s)}{\lambda(s)}. \tag{26}$$


This beginning-of-period interest rate is the one considered in Lucas
(1982) (and in Townsend [1982]), and it is directly related to whether 
the liquidity constraint is binding or not, via the shadow price of the 
liquidity constraint, the marginal utility of real balances. Positive
beginning-of-period nominal interest rates imply the quantity equation
and unitary income velocity of money, and a nonbinding liquidity 
constraint and variable velocity implies zero interest rates. 
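
The link between the beginning-of-period interest rate and the liquidity constraint can be illustrated with a small sketch. It assumes, as stated in Section IV, that the marginal utility of real balances $\mu$ is the difference between the marginal utility of consumption $u_c$ and the marginal utility of wealth $\lambda$, so that the rate in (26) is $(u_c - \lambda)/\lambda$. The function name and the numerical inputs are hypothetical.

```python
def beginning_of_period_rate(u_c: float, lam: float) -> float:
    """Beginning-of-period nominal rate mu/lambda, with mu = u_c - lambda.

    u_c : marginal utility of consumption in the current state
    lam : marginal utility of wealth (lambda) in the current state
    """
    if lam <= 0:
        raise ValueError("lambda must be positive")
    if u_c < lam:
        raise ValueError("the shadow price mu = u_c - lambda cannot be negative")
    mu = u_c - lam  # shadow price of the liquidity constraint
    return mu / lam


# Nonbinding constraint: mu = 0, so the rate is zero.
rate_nonbinding = beginning_of_period_rate(u_c=1.0, lam=1.0)

# Binding constraint: mu > 0, so the rate is positive (about 0.25 here).
rate_binding = beginning_of_period_rate(u_c=1.0, lam=0.8)

print(rate_nonbinding, rate_binding)
```

The zero rate in the nonbinding case corresponds to the statement that variable velocity comes with zero beginning-of-period interest rates.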

Also note that beginning-of-period and end-of-period bonds have 
different degrees of liquidity, in the sense that the cash received when 
a bond matures in one case can (in another cannot) directly purchase 
consumption goods. Similarly, we can consider at least two different 
real interest rates, depending on the precise characteristics of the 
corresponding indexed bonds. In general, in order to price an asset, 
all relevant characteristics of the asset have to be sufficiently specified. 
These include when the asset is traded and when and how its dividends
are paid.
VI. Conclusions, Limitations, and Possible
Extensions 

This paper derives a demand for money in a general equilibrium 
asset-pricing framework with stochastic output and monetary expan¬ 
sion, where the liquidity services of money are endogenously deter¬ 
mined via a cash-in-advance constraint. In particular this framework 
makes it possible to express the demand for real balances, and the 
equilibrium price level, rate of inflation, and nominal and real interest 
rates, as explicit and simple functions of temporary and permanent 
disturbances in output and monetary expansion. 

What do we learn from this? From a more methodological point of 
view we see that the cash-in-advance approach can be extended rela¬ 
tively easily to a general equilibrium asset-pricing framework and 
need not be restricted to give a unitary income velocity of money. The 
cash-in-advance approach then is a convenient way to model the cir¬ 
cumstance that cash is more liquid than nonmoney wealth. Although 
for some problems it may make little difference whether money is 
introduced via a cash-in-advance constraint or directly in the utility 
function, it seems that the cash-in-advance approach offers some dis¬ 
tinct advantages. Not only does it model more specifically the transac¬ 
tions role of money, it does not need ad hoc assumptions about cross 
partials of the utility function—frequent in the money-in-the-utility- 
function literature—and it conveniently represents the difference in 
liquidity between cash and other wealth. From a purely analytical 
point of view, it also seems easier to use. 29 

The simplicity of the explicit solution of the model allows for a 
variety of applications. One application is a more systematic study of 
the much-discussed correlation between inflation and rates of return 
on different assets. The various correlations between endogenous 
inflation and rates of return will depend crucially on the nature of the 
exogenous disturbances, that is, whether in output or in monetary 
expansion, and temporary or permanent, as shown in Nakibullah 
(1984). Another obvious extension is to international finance issues, 
as in Lucas (1982). Some international issues are discussed in Stock- 
man and Svensson (1985) and Svensson (1985a). 

29 LeRoy (1984a, 1984b), who uses money in the utility function, solves the general
equilibrium system with either output or monetary expansion deterministic, and with
the utility function additively separable between consumption and real balances.
Danthine and Donaldson (1983) solve the system with monetary expansion deterministic.
I have tried to solve for the equilibrium with money in a nonseparable utility
function and both output and money expansion stochastic, but I have not been able to
find a solution. Other aspects of the difference between the money-in-the-utility-function
and cash-in-advance approaches are discussed in Svensson (1983).
Optimal monetary policy can to some extent be discussed in the
present model (see Svensson 1983). Extending the model to a richer 
menu of monetary and fiscal policy, by introducing government ex¬ 
penditure on goods, outstanding government debt, and proportional 
taxation (e.g., as in Lucas and Stokey 1983), also seems a suitable area
for future research. It can be argued that asset-pricing models with a
flexible price level exaggerate the variability of the price level. In
Svensson (1985b) some consequences of a sticky price level are examined.

The results on the Fisher relation and the premium on nominal
bonds show the possibilities in a general equilibrium model to relate 
these relations between endogenous variables to properties of the 
exogenous stochastic processes. This is not specific to the cash-in¬ 
advance approach, of course. What it contributes is, except the sim¬ 
plicity of the explicit solution, the distinction between marginal utility 
of wealth and of consumption, and the insight that the relevant at¬ 
titude to risk is with respect to wealth rather than to consumption. 

The general limitations of this kind of general equilibrium analysis
with a pure exchange economy are discussed by Lucas (1982) and are 
now well known. In particular the assumption of identical consumers 
and absence of physical investment, and the reliance on a "perfectly 
pooled" stationary equilibrium, are serious restrictions. They imply 
that portfolios are never revised, and in equilibrium consumption and 
utility are independent of monetary expansion. Relative prices are 
affected by monetary policy, though, and the idea is of course that in
a richer model, say with heterogeneous consumers, there would be
real effects on consumption and welfare of monetary policy. The
cash-in-advance approach so far has the additional limitation of treat¬ 
ing the transactions period as exogenous. Also, only goods transac¬ 
tions are assumed to absorb cash. If also asset transactions absorb 
cash, the demand for real balances is modified. This is another area
for future research. 
Copying and Indirect Appropriability:
Photocopying of Journals


S. J. Liebowitz 

University of Rochester and University of Chicago 


Creators and owners of intellectual properties are alarmed by the 
growth of technologies that ease the task of copying these properties. 
This paper, however, shows that the unauthorized copying of intel¬ 
lectual properties need not be harmful and actually may be 
beneficial. The empirical impact of photocopying on publishers of 
journals is examined in an attempt to discover if publishers can 
indirectly appropriate revenues from users who are not original pur¬ 
chasers. The evidence indicates that publishers can indirectly appro¬ 
priate revenues from users who do not directly purchase journals 
and that photocopying has not harmed journal publishers. 


It has become dramatically easier to make copies of printed materials 
since the introduction of the Xerox 914 copier in 1959. Since that 
time, users of printed materials have been busily making copies of 
these materials with this and subsequently introduced copying ma¬ 
chines, conveniently and at low cost. The copies are generally not of 
as high a quality as the printed originals, but they are often much 
more convenient. The producers of copyrighted printed materials see 
this use of their product as an infringement of their property rights 
and, more important, as a drain on demand and revenues. But the 
feasibility of photocopying has two other important effects not gener¬ 
ally acknowledged: (1) because the materials can be inexpensively 
copied, there is an increased demand for them as copyable originals 
(i.e., the demand of copiers can be indirectly appropriated by copyright
owners), and (2) the [...] of the copyrighted good may be
dramatically altered. Because of these two effects, photocopying need
not [...] revenues of copyright holders.

In this paper I examine [...] photocopying [...] on journal publishers,
with the conclusions that these [...]
and that photocopying has had a salutary effect on the
profitability of publishers of photocopied materials. The debate and
litigation between owners and users of copyrighted materials, therefore,
may be misplaced. 1


I. Copyright Law and Appropriability 

Copyright gives authors certain property rights over their intellectual 
creations, the most important being the sole right to reproduce or 
publish the work. It is often suggested that without such a right au¬ 
thors and publishers would not receive remuneration sufficient to 
create their intellectual works. Copyright, however, is rather narrow 
in scope, protecting merely the expression of intellectual ideas and
allowing several authors to copyright similar or identical works inde¬ 
pendently if they were created independently. The usual term of 
copyright is the life of the author plus 50 years. There are various 
exceptions to copyright protection such as performance for charitable 
causes or copying short passages for use in schools. An important 
exception, which is most germane to academic and other researchers 
making photocopies, is known as fair use. Fair use is a defense to a 
claim of infringement currently provided in Anglo-American copyright
law when the copying is done for purposes such as research,
teaching, news reporting, or commentary. 2 The courts determine
whether a particular action constitutes fair use, and no hard and fast 
demarcation exists. 

Although it is often suggested that copyright is required if creators
of intellectual products are to be able to appropriate revenues from
users of their products, copyright is only one of several possible
methods whereby authors or publishers can appropriate revenues
from those who use intellectual properties. Plant (1934), for example,
claimed that being first in the market allowed authors to capture a
large portion of the potential revenue, since he believed that it took
considerable time for the inherent monopoly power of the first
publisher to erode. Another potential form of appropriation, and the one
to be examined here, concerns the ability of authors to appropriate
revenues indirectly from users who do not directly pay authors for the
right to use their creation.

1 Some of these legal actions involved the Association of American
Publishers in litigation against the American Cyanamid Corporation, the
Squibb Corporation, and New York University. These cases were settled
out of court.

2 See Liebowitz (1984) for a more thorough discussion of fair use,
copying, and the economics of copyright. I argue there that the fair-use
doctrine was an attempt to identify those instances when copying would
not substitute for purchase.


II. The Potential Impacts of Photocopying 

A copyright holder’s profits are threatened when his ability to
appropriate revenues is reduced. The substitution of copying for purchase
has generally been viewed as decreasing the potential appropriability
of copyright owners. Yet it is certainly not the case that direct
payment need be made to sellers of products in order for them to
appropriate revenues from users. A simple analogy illustrates this
point.

Ford sells new cars to both private individuals and automobile
rental companies such as Hertz. These cars are durable goods, lasting
for many years. Often for individuals, and almost always for the car
rental companies, the cars are resold before their useful lives are
finished. The price that Hertz is willing to pay for a car depends not
only on the value of a car in the car rental business but also on the
resale value of the car when Hertz is finished renting it out. When
Hertz buys new cars it includes the expected discounted value of the
resale price of the car in the price it is willing to pay. The purchaser of
the used car from Hertz does not pay anything directly to Ford, but
Ford received indirect payment when it sold the new car in anticipation
of this later resale. Thus direct payment is not necessary in order
for Ford to appropriate revenues from future users of its used
products over the useful lives of these products.
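
The Hertz example reduces to a present-value calculation, which the following minimal sketch illustrates. The function name and every number in it (the rental value, the resale price, the holding period, and the discount rate) are hypothetical and do not come from the paper. The sketch shows only how the discounted resale price enters the first buyer's willingness to pay, which is the channel through which the used-car buyer pays Ford indirectly.

```python
def willingness_to_pay(rental_value, resale_price, years_held, discount_rate):
    """Maximum price a first buyer would pay for a new durable good.

    rental_value  -- present value of the services the first buyer receives
    resale_price  -- price expected from the second-hand buyer at resale
    years_held    -- years until the good is resold
    discount_rate -- annual rate used to discount the resale price
    """
    discounted_resale = resale_price / (1.0 + discount_rate) ** years_held
    return rental_value + discounted_resale


# Hypothetical numbers, chosen only for illustration.
with_resale = willingness_to_pay(6000.0, 5000.0, 2, 0.10)
without_resale = willingness_to_pay(6000.0, 0.0, 2, 0.10)

# The difference is the part of the used-car buyer's payment that
# reaches the manufacturer indirectly through the new-car price.
indirect_payment = with_resale - without_resale
print(round(indirect_payment, 2))
```

Under these assumed numbers the indirect payment is simply the resale price discounted over two years; a lower expected resale price would lower the price the first buyer is willing to pay by the same discounted amount.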

The same type of analysis can be made for intellectual products,
which are capable of being used over and over again regardless of
their particular physical manifestation. The copyright owner sells a
certain number of authorized copies, from which unauthorized
copies are made. The users of unauthorized copies, like the buyers of
used cars, may be indirectly paying the copyright owner for their
unauthorized copies if the owners of authorized copies take the
“resale” value of the authorized copies into account when they purchase
them.³ Therefore, the impact of copying on copyright owner revenues
is unclear, as are any welfare implications.⁴

3 The ability of a producer of durable goods (such as Ford) to
appropriate revenues from consumers of used vehicles has been analyzed
by Benjamin and Kormendi (1974). Liebowitz (1982) extends this analysis
and Liebowitz (1981) applies it to photocopying in much greater detail
than is provided here. Besen (1984) also applies and further extends
this model to the case of photocopying. These models show that
photocopying may increase or decrease the ability of a producer to
appropriate revenues.

4 Novos and Waldman (1984) and Johnson (1985) have examined the welfare
implications of unauthorized copying, ignoring the complications caused
by indirect appropriability. Neglect of this factor alone would have
been sufficient to warrant concern about the generality of their
results, but unfortunately both analyses also assume that the cost of
producing a copy is lower for the copyright owner than it is for
unauthorized copiers because, in Johnson's words (p. 161), "producers
would use the copy technology if it were cheaper." I believe that this
assumption, although crucial to the welfare implications derived in
these papers, is probably incorrect. One must compare the costs of the
products delivered to consumers, not merely the manufacturing costs, or
else one could falsely demonstrate that welfare would be improved if
only one electrical generating plant were built for the entire world
(since the production of electricity enjoys economies of scale, but the
transmission is characterized by diseconomies of scale). The cost of
producing an unauthorized photocopy of an article must be compared with
the copyright owner's costs of producing and delivering (Zap Mail,
Federal Express) an authorized copy (reprint) in the same period of
time, or else the cost of the time delay must be taken into account.
Until publishers have all their articles on readily accessible computer
files, the social cost of delivering an unauthorized copy appears to be
lower than that of delivering an authorized copy.

The analogy to the used car market is weakened when there exists
variability in the number of copies made from each original. It is
much more difficult for the publisher to appropriate revenues from
copiers when copies are made from some originals but not from
others. Since the copying of originals will not have an equal impact on
all originals when only some are copied, copying will alter the relative
values provided by originals. In order to retain the same degree of
appropriability in the face of this copying-induced variation in value,
the publisher would need to be able to price discriminate among
purchasers of originals, charging a higher price for those originals
that would be used to make many copies. If price discrimination were
not possible, the publisher could charge a high price, essentially
allowing purchase only by those individuals planning to make copies and
removing much of their surplus. Or he could charge a lower price,
generating a larger quantity sold since both copiers and noncopiers
would buy originals, but failing to capture much of the surplus from
customers planning to make copies of their originals.⁵ The inability of
any owner of an original or photocopy to appropriate value from
those making copies reduces the ability of the copyright owner to
appropriate value deriving from his work. In the extreme, if a
preponderance of individuals used unauthorized copies and if the chain
of appropriability were broken early on, appropriability could be
almost completely eliminated.

5 This is the problem that producers of prerecorded video cassettes
believe they face. The purchasers of these cassettes fit into two
distinct categories: (1) firms renting out these cassettes to the public
and (2) individuals buying the cassettes for their own use (although
possibly lending or reselling them afterward). Producers of video
cassettes wish to modify that part of the copyright law known as the
First Sale Doctrine, which allows the rental of these cassettes. This
would, the industry believes, allow two-tiered pricing, with home
customers required to pay a lower price than video rental stores (see
Landro 1984). The existence and implications of such a two-tiered
pricing system for journal publishers are documented in the following
pages.

The existence of cheap photocopying is also likely to alter the
relative demands for intellectual properties. For example, cheap copying
will alter the work habits of intellectual property users. Those items
(books) that are weak complements with photocopying will be
replaced by strong complements (journals).⁶ In fact, the entire network
of intellectual communications (or scholarship production function)
may change in response to the new reprographic technology. These
changes in demands will be referred to as the “exposure” impact.
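
The dilemma of a publisher who cannot price discriminate, described earlier in this section, can be made concrete with a small numerical sketch. The group sizes and valuations below are hypothetical, and production costs are ignored for simplicity; none of these figures come from the paper. The sketch compares revenue at a high uniform price (only copiers buy), at a low uniform price (everyone buys), and under two-tier pricing.

```python
def revenue_single_price(price, buyers):
    """Revenue at one uniform price; buyers is a list of (count, valuation)."""
    return sum(count * price for count, valuation in buyers if valuation >= price)


def revenue_discriminating(buyers):
    """Revenue when each group can be charged its own valuation."""
    return sum(count * valuation for count, valuation in buyers)


# Hypothetical market: 100 buyers who make copies and value an original
# at 50, and 900 buyers who do not copy and value it at 20.
copiers = (100, 50.0)
noncopiers = (900, 20.0)
market = [copiers, noncopiers]

high = revenue_single_price(50.0, market)  # only copiers buy
low = revenue_single_price(20.0, market)   # copiers and noncopiers buy
both = revenue_discriminating(market)      # two-tier pricing

# Surplus left with copiers when the low uniform price is charged.
uncaptured = both - low
print(high, low, both, uncaptured)
```

With these assumed numbers the low uniform price earns more than the high one, yet it leaves the copiers' extra valuation uncaptured; only two-tier pricing recovers it. Different assumed group sizes could reverse the ranking of the two uniform prices.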

III. Anatomy of Journal Photocopying 

I do not believe that anyone would deny that the number of copies
made from each original varies. Most journals sent to individuals are
infrequently photocopied, as they are intended primarily for the
private use of the subscriber. Journals sent to libraries, on the other
hand, are much more frequently photocopied, since their function is
to assist the many library patrons interested in research and study. A
library’s willingness to pay for journals should increase when
photocopying is done on the premises because the availability of
photocopying causes a library’s users to value the library’s journal
holdings more highly and library funding is (almost certainly) related
in some manner to the tastes and values of library users. As long as
libraries pay subscription prices related to the valuation of journals
by library users, publishers need not be harmed by the photocopying
done in libraries; this, of course, implies price discrimination if
publishers are also to sell to individuals.

In order for publishers to price discriminate successfully, they must
be able to prevent arbitrage. It is fortunate for journal publishers that
such discrimination is feasible under current institutional
arrangements. Libraries do not usually buy their journals directly from
publishers, but instead through middlemen (e.g., Faxon). These agencies
can be charged the higher discriminatory price since they are very
easy to identify. The gain from using these middlemen helps deter
libraries from trying to order journals at the lower price by disguising
themselves as individuals. Also, the perceived risk associated with
ordering journals at the lower price might loom very large to
librarians, who might bear much of the cost but little of the benefit.
This risk might appear particularly high since publishers, or their
agents, merely have to inspect the holdings of a library to determine if
the library were paying the institutional price (some sort of mark
could easily be printed in institutional subscriptions to enhance
detection).⁷ For these reasons, therefore, journal publishers are
routinely able to charge higher subscription prices to libraries.

6 See Liebowitz (1981) for a discussion of five studies of photocopying
behavior in libraries, which demonstrate that journals are more
frequently photocopied than are books.

IV. Empirical Magnitudes 

The empirical work will be composed of two parts. The first part
directly tests for the existence of indirect appropriability and
photocopying's impact on price discrimination. The second portion of the
empirical work examines the performance of publishers as
photocopying activity has grown.

A. Testing for Indirect Appropriability

The thrust of the empirical work is to investigate whether the value
placed on an intellectual property by individuals using copies can be
indirectly appropriated by the copyright owner. Since journals in
libraries are more heavily photocopied than those owned by
individuals, indirect appropriability, if it exists, will raise the
demand that libraries have for originals by more than it raises the
demand by individuals. If the publisher is able to price discriminate
between libraries and individuals, we should find photocopying
increasing the price charged to libraries by more than it increases the
price charged to individuals. In addition, frequently photocopied items
should register larger differentials than items less frequently copied.
In order to test these propositions, data for a sample of 80 economics
journals were collected;⁸ these data included institutional and
individual subscription prices in both 1959 and 1982, the number of
citations received by each journal, the age of the journal, and the type
of publisher. Although none of these tests can be considered proof that
photocopying is responsible for the price discrimination that
currently exists, no other explanations come readily to mind that can
explain all these results coherently.


Test 1 

From the discussion above it follows that heavily photocopied journals
should have greater price differentials between libraries and
individuals than those that are less heavily photocopied. Although data
on the number of photocopies made of various journals are unavailable,
the number of citations journals receive is taken as a proxy for
popularity and hence photocopying activity.

7 Librarians probably do not know what the legal sanctions against
fraudulent subscribers would be. Nor do I.

8 This sample was taken from Liebowitz and Palmer (1984), who present
evidence on the citations to 108 economics journals. Twenty-eight of
these journals were not available in the University of Rochester
library, were missing data, or could not be classified. The list of
included journals is available on request.

Several factors besides photocopying intensity may influence the
pricing of a particular journal. For example, Fry and White (1976)
found that commercial publishers increased their price discrimination
during the 1969-73 period to a considerably greater extent than
other types of publishers. The same result holds for the economics
journals in the sample now under examination: the ratio of
institutional price to individual price was considerably greater (though
not statistically significant at the 95 percent level) for journals
published by commercial publishers (2.08) as opposed to noncommercial
publishers (1.50).

It also appears that journals of recent birth are higher and more
discriminatorily priced than journals with a longer history. For
example, the journals introduced after 1959 had an institutional price
per page of $0.125 in 1983, while the journals in existence prior to
1959 had an average 1983 institutional price per page of $0.065. The
average ratio of institutional to individual prices was 1.82 for journals
founded after 1959 and 1.50 for journals existing prior to 1959. In
large part, the reason that young journals are more aggressively
priced seems to be that most new journals are published by commercial
publishers (65 percent), whereas most of the older journals (93
percent) are published by noncommercial publishers. These statistics,
while dramatically representing the evolutionary superiority of the
discriminatory pricing strategy, also indicate possible collinearity
problems with these two variables.

In an attempt to examine the pricing policies of publishers with
respect to the proxied photocopying of their journals, regressions
were run that attempted to control for some of these influences. The
dependent variable was the ratio of price charged to institutions
($P_{\mathrm{LIB}}$) over that charged to individuals
($P_{\mathrm{IND}}$). The independent variables included the number of
cites per page (cites) received by each of these journals in 1981 to
articles written in 1975-79,⁹ a dummy variable (pub D) that equaled one
if the journal had its price determined by a commercial firm,¹⁰ and a
dummy variable (age D) that equaled one if the journal was in existence
prior to 1959. The results are:

$$\frac{P_{\mathrm{LIB}}}{P_{\mathrm{IND}}} = 1.29 + \underset{(1.99)}{.0065}\ \text{cites} + \underset{(4.14)}{.650}\ \text{pub D}, \qquad N = 80,\ R^2 = .17; \tag{1}$$

$$\frac{P_{\mathrm{LIB}}}{P_{\mathrm{IND}}} = 1.38 + \underset{(2.14)}{.0071}\ \text{cites} + \underset{(3.36)}{.578}\ \text{pub D} - \underset{(1.01)}{.160}\ \text{age D}, \qquad N = 80,\ R^2 = .17 \tag{2}$$

9 This number was calculated as an intermediate step to those numbers
reported in Liebowitz and Palmer (1984, table 2, col. 1).

10 Commercial means private, not a society or university publisher.

(t-statistics are in parentheses). All variables have the predicted sign.
The variable of paramount interest, cites, is significant at the 95
percent level of confidence under both specifications. This significant
positive coefficient confirms the existence of indirect appropriability.
The more intensive use of a journal in libraries allows publishers to
raise the price charged to libraries. This test does not discriminate
between various causes of indirect appropriability such as the lending
of journals by libraries to readers as opposed to lending journals to
copiers. The next two tests will address this particular issue. The
coefficient on pub D indicates that the pricing of commercial
publishers is significantly more discriminatory than that of
noncommercial publishers. Age of journal seems to have little clear
effect on the degree of price discrimination.
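
To see what the estimates imply, the short sketch below evaluates the fitted institutional-to-individual price ratio, with the coefficients read from equations (1) and (2) (the intercept of equation (2) is taken to be 1.38). The function names and the value chosen for the cites variable are hypothetical and serve only as an illustration; they are not taken from the paper's data.

```python
def ratio_eq1(cites, commercial):
    """Fitted institutional/individual price ratio from equation (1)."""
    return 1.29 + 0.0065 * cites + 0.650 * (1 if commercial else 0)


def ratio_eq2(cites, commercial, pre_1959):
    """Fitted ratio from equation (2), which adds the age dummy."""
    return (1.38 + 0.0071 * cites
            + 0.578 * (1 if commercial else 0)
            - 0.160 * (1 if pre_1959 else 0))


# Hypothetical journal with a cites value of 40, chosen for illustration.
cites = 40
commercial_eq1 = ratio_eq1(cites, commercial=True)
commercial_new = ratio_eq2(cites, commercial=True, pre_1959=False)
society_old = ratio_eq2(cites, commercial=False, pre_1959=True)

print(round(commercial_eq1, 3), round(commercial_new, 3), round(society_old, 3))
```

The gap between the commercial and noncommercial fitted ratios is the pub D coefficient plus, in equation (2), the age D coefficient when the noncommercial journal predates 1959; the cites term shifts both ratios by the same amount.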

Test 2

This test examines the behavior of publishers regarding the price
charged to individuals versus institutions. The sample of economics
journals provides striking evidence for the contention that
photocopying increased the price paid by libraries for journals relative
to the price paid by individuals. In 1959, the year the Xerox 914 was
introduced, only three out of the 38 journals then in existence (or 8
percent) price discriminated between institutions and individuals. In
1983, 59 of 80 journals (74 percent) charged higher prices to libraries
(the 1983 price differential, averaged over all journals in the sample,
was about two-thirds of the individual price). The emergence of price
discrimination was not limited to economics journals. Fry and White
(1976) and Liebowitz (1981) studied journals covering many
disciplines, and both found prices for institutions rising more rapidly
than prices for individuals during the 1960s and 1970s.
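
The two percentages quoted in this test follow directly from the journal counts. The few lines below simply redo that arithmetic; the function and variable names are hypothetical, and the counts are the ones stated in the text.

```python
def share_discriminating(discriminating, total):
    """Fraction of journals charging libraries more than individuals."""
    return discriminating / total


share_1959 = share_discriminating(3, 38)   # journals in existence in 1959
share_1983 = share_discriminating(59, 80)  # full sample in 1983

# Change in the share, in percentage points.
change_points = 100 * (share_1983 - share_1959)
print(f"{share_1959:.0%} {share_1983:.0%} {change_points:.0f}")
```

Note that the two shares are computed over different sets of journals (38 in 1959, 80 in 1983), so the change reflects both new pricing by existing journals and the entry of new journals.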

The lack of price discrimination in 1959 might seem surprising
considering that libraries have always served many users and had
multiple uses (loans) of items. The concept of indirect appropriability,
after all, was derived from the analysis of the lending of durable
goods (as in the Hertz example), and one would expect the price
charged by publishers for the originals used by many persons to be
greater than the price charged for originals used by only one person.
The reader should note, however, that libraries are currently sold
books (which are only infrequently photocopied) at prices no higher
than that charged to individuals even though the books are capable of
being used many times. There is a possible explanation. Compared
with hardcover books sold to individuals, those sold to libraries are
likely to give much lower values per reading, since individual
purchasers of hardcover books have much higher values of the book than
most potential readers. Otherwise, these individuals would buy the
much cheaper paperback editions. Therefore, even if many people
read the book in the library, the sum of their values might not be
much different from the valuation placed on the book by individuals
purchasing a hardcover edition. In addition, since books wear out
with use, the number of library patrons who can read a book is limited
by the durability of the book. Photocopying, on the other hand, allows
an unlimited number of copies to be made from a single original and
is capable, in principle, of raising the value of copyrighted materials
by more than physical lending can do. Similar factors might have
been responsible for the lack of price discrimination by journal
publishers prior to the era of cheap photocopying.

It appears, therefore, that when the impact of heavy photocopying
is excluded, library valuations of printed materials are not sufficiently
higher than individual valuations to warrant a higher price to
libraries. Since photocopying enhances any differences in valuations
between individuals and libraries (and this difference was very small in
1959), the now common price discrimination between individuals and
institutions can be attributed mainly to photocopying. The
differential pricing by publishers of books and journals conforms nicely
to the differential photocopying practices toward each.

Test 3 

Cheap copying should cause some people to substitute copying for
the purchase of originals, and indirect appropriability allows
publishers to appropriate some of the surplus generated by this copying.
Since no such analogous effect should occur for items not frequently
photocopied, the valuations of library users should shift toward
photocopiable items (journals) and away from those items less often
copied (books). This test examines the behavior of libraries with
respect to the book-journal trade-off.

Trends in library expenditures for American academic libraries
were examined for the time period 1941-81 and are reproduced in
table 1. For almost 30 years there was no trend in the ratio of
expenditures on books relative to periodicals, but by the late 1960s
there appears a very dramatic downward trend in this ratio as shown in
table 1. (Similar results are reported in Machlup and Leeson [1978],
reproduced on p. 140 of the National Enquiry [1979].) That the shift
did not begin until the late 1960s implies almost a 10-year lag between
the introduction of photocopying and its impact on the equilibrium
journal prices. Of course, this lag is not surprising, since it
would have taken years for the stock of photocopiers to permeate the
economy and for libraries to respond to the altered behavior of their
patrons.¹¹ These data should also dispel the notion that a major shift
in library expenditures toward periodicals has been going on for a
much longer period of time.


B. The Performance of Journal Publishers

Cheap copying should induce some individuals to switch from
personal subscriptions to the use (copying) of journals in libraries.
Indirect appropriability allows publishers to generate some revenue from
these former subscribers.¹² The exposure effect, however, assuming it
is positive, tends to increase the number of both subscribers and
copiers, thus giving no clear prediction regarding photocopying's
influence on the number of subscriptions. Further clouding the
empirical analysis is the impact of new journals, which tends to decrease
subscriptions to existing substitute journals. Increases in the
population of potential readers should have exactly the opposite effect.
It is necessary, therefore, to examine subscriptions per journal, the
number and nature of journals, and the number of potential readers when
examining the impact of photocopying on publishers of journals.

Consider first the number of subscriptions per journal. Several
studies¹³ show that both titles (estimates ranging from 2 percent to 5
percent) and subscriptions per title (1.5-5 percent) grew during this
period at rates that cumulatively appear similar to the growth in
potential readership, since annual growth of Ph.D.-level personnel was
about 4 percent in the 1960s and 5.5 percent in the 1970s.¹⁴ The
National Enquiry confirms the similarity of subscription growth and
growth in potential readership when it states (1979, p. 44): “The
number of individual subscriptions per scholar has not changed in
recent years.” Finally, these studies demonstrate that the size of
journals appears to have changed significantly. The COSTC study found
a yearly increase in pages per journal of 7 percent, whereas Fry and
White found yearly increases of 4 percent. These large magnitudes
indicate that even if the subscriptions per capita did not increase,
subscribed pages per capita almost certainly did.

11 At the end of 1960, 2,000 Xerox 914 copiers were in place. The
figures for the next 6 years are 8,760, 19,190, 32,160, 47,654, 62,092,
and 56,612. Xerox's revenues rose from about $700 million in 1966 to
$1.6 billion in 1970, $2.9 billion in 1973, $4.4 billion in 1976, and
$8.2 billion in 1980. Since Xerox’s revenues are based almost entirely
on photocopying, its market share fell through the 1970s, and the price
per copied page has fallen over time, we can conclude that the growth of
the photocopying market was greater even than these numbers indicate.

12 Publishers, therefore, in assessing the net impact would need to
compare the subscription revenue minus the saving from not producing and
mailing the journal originals to these former subscribers with the
indirect revenue generated by copying. Note that the publisher saves the
cost of producing originals when former subscribers switch to copying.

13 See Committee on Scientific and Technical Communication (COSTC)
(1970), the discussion of the survey by the International Group of
Scientific, Technical and Medical Publishers discussed in Asser (1978),
Fry and White (1976), and National Enquiry (1979).

14 This estimate of Ph.D. personnel was derived from U.S. Statistical
Abstracts: population, 1960, 1970, 1980; percentage of population
completing college, 1960, 1970, 1980; doctorates as a percentage of
degrees earned, 1940-80.

TABLE 1

Ratio of Expenditures on Books to Expenditures on Periodicals
(U.S. Academic)

Year       Ratio      Year    Ratio
1941-42    3.02       1961    3.19
1944-45    3.41       1963    3.34
1946-47    3.13       1965    3.36
1948-49    3.40       1968    3.67
1950-51    3.01       1971    2.96
1952-53    3.40       1973    1.96
1954-55    n.a.       1975    1.70
1956-57    n.a.       1977    1.54
1958       n.a.       1979    1.26
1959       2.46       1981    1.13

Source: Data for 1941-53 taken from "College and University Library
Statistics," College and Research Libraries, March 1943, July 1947,
July 1948, April 1950, January 1952, and January 1954. Median
expenditures are for type 1 (largest) academic libraries. No data are
available for 1954-58. Data for 1959-81 taken from the Bowker Annual
(1970-82). Expenditures refer to total expenditures for all academic
libraries.
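
The book-to-periodical ratios in table 1 can be restated in a form that makes the post-1968 shift easier to see. The sketch below is illustrative only: the dictionary holds the ratios as read from table 1 (the 1979 entry is read as 1.26), and the function names are hypothetical. It converts each ratio into the periodicals' share of combined book and periodical spending and computes the proportional fall in the ratio from 1968 to 1981.

```python
# Book-to-periodical expenditure ratios as read from table 1
# (the 1979 entry is read as 1.26).
RATIO = {1968: 3.67, 1971: 2.96, 1973: 1.96, 1975: 1.70,
         1977: 1.54, 1979: 1.26, 1981: 1.13}


def periodical_share(book_to_periodical_ratio):
    """Periodicals' share of combined book-plus-periodical spending."""
    return 1.0 / (1.0 + book_to_periodical_ratio)


def proportional_decline(start, end):
    """Fractional fall in the ratio between two years."""
    return (RATIO[start] - RATIO[end]) / RATIO[start]


decline_1968_1981 = proportional_decline(1968, 1981)
share_1968 = periodical_share(RATIO[1968])
share_1981 = periodical_share(RATIO[1981])

print(f"{decline_1968_1981:.3f} {share_1968:.3f} {share_1981:.3f}")
```

The share calculation assumes that books and periodicals are the only two categories being compared; it says nothing about other library expenditures, which table 1 does not report.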

Because photocopying increased indirect appropriability and the
number of subscriptions per capita did not fall, journal publishers
should have found reprography beneficial to profits even if there
were no population growth. Population growth should further
enhance these profits. Evidence on the profitability of journals can be
gleaned from data on their establishment and discontinuance.
According to the National Enquiry (1979), the ratio of births to deaths
during 1971-77 was six to one, hardly indicative of a bleak market.
Fry and White (1976) point out that humanities journals established
in the period 1969-73 accounted for 13 percent of all humanities
subscriptions in 1973. In addition, the average number of pages per
journal in the humanities grew by 16 percent during the period. Fry
and White feel that this situation “is particularly interesting in view of
the indication that journals in the humanities are in the most
consistent financial difficulties of all disciplines” (1976, p. 34).

These data reveal that the growth in reprography has not coincided
with a period of decline for journals. Rather, since publishers
introduced many new journals, positive profits presumably were
expected.

V. Conclusions and Policy Considerations

I have endeavored in this paper to demonstrate that the debate
between publishers and users regarding photocopying’s influence on
publisher revenues has neglected important economic factors such as
indirect appropriability, exposure effects, and price discrimination.
Cognizance of these factors makes it clear that the determination of
photocopying’s impact is a complex problem that cannot be resolved
without resorting to empirical evidence. The empirical work in
this paper indicated that photocopying has not had a detrimental
impact on publishers.

Discussions of policy have generally assumed that unauthorized
copying must be harmful to copyright owners¹⁵ and have tried to
weigh the harm to copyright owners from unauthorized photocopying
against the gain to users.¹⁶ The analysis of this paper alters this
view considerably. Also altered is the concept of fair use, since with
indirect appropriability fair users may pay for their use of the
intellectual product (albeit indirectly) just as other users pay. In
addition, recent attempts promoting “clearinghouses” to make the
collection of copyright payments more economic may be seen as possibly
redundant or unnecessary.¹⁷ Finally, the analysis presented here might
help us understand the impacts of other reprographic technologies. The
copying of other media may or may not have impacts similar to those
found for photocopying. Only case-by-case empirical investigations of
institutions and markets can discover the impacts of these other forms
of copying.


15 The myopic view can be represented by the following quote from the
president of a well-known publishing house: “Uncontrolled photocopying
is largely responsible for the deaths of two journals which were
published by the Williams and Wilkins Company, and if the condition is
allowed to continue, many more will either go out of business or be
published under government subsidy” (quoted in Thatcher 1978, p. 324).

16 This paper does not address the very difficult issues involved in
determining the “optimal” copyright protection and deviations from
optimal levels of appropriability.

17 For example, in hearings prior to the 1976 copyright law revision,
the U.S. Congress urged that an organization be created to simplify
copyright negotiations between users and publishers and, presumably, to
increase copyright revenue. In response to this, a nonprofit Copyright
Clearance Center (CCC) was set up, although it has, to date, failed
miserably at its dual tasks of increasing copyright revenues and
decreasing transactions costs. Many copyright owners (publishers) did
not even bother to enroll in the system, and most users (libraries) do
not report photocopying activity when it occurs (because such reporting
is voluntary and there is little expenditure of resources by the CCC on
monitoring or enforcement). The CCC has failed to cover its costs of
operations and is sustained only through grants and donations. The
reader interested in further documentation regarding the performance of
the CCC is referred to Liebowitz
(1985).


Competition with Hidden Knowledge


John G. Riley

University of California, Los Angeles


In markets where product quality is diffuse and verification by
buyers is sufficiently costly, high-quality sellers have an incentive to
“signal” to buyers by investing in some activity that is more costly for
low-quality sellers. Unfortunately, with competition among buyers
over the price paid for each level of the signal, there is, in general, no
Nash equilibrium. However, it is sufficient for equilibrium that (i)
low-quality sellers would, under symmetric information, choose not
to enter the market, and (ii) the rate at which the marginal cost of
signaling declines across types is sufficiently large.


The large and rapidly growing literature on principal-agent problems
is conveniently divided into papers that focus on problems of hidden
actions and those that focus on problems of hidden knowledge. The
latter are also naturally divided into studies of incentive schemes in
which the principal is a monopolist¹ and those in which a large
number of principals compete for agents’ services. Here we focus on the
many principal-many agent problem when knowledge is hidden.

The first formal discussion of the issues is provided by Spence
(197d), who examines markets in which sellers (agents) have private
information about the quality of their products. There is also some
activity that is less costly for sellers with higher-quality products.
Recognizing that this activity is a potential “signal” of product quality,
buyers pay a premium for higher levels of the signal.

Helpful discussions with Larry Kotlikoff, Jim Mirrlees, Sheridan Titman, Brett 
Trueman, and Michael Waldman and suggestions by the referees are gratefully 
acknowledged. This research was supported by the National Science Foundation. 

1 One example is the choice of an income tax scheme by the tax authority when ability 
is unobservable (Mirrlees 1971). Another is the choice of an optimal selling scheme by 
the owner of a unique object (Riley and Samuelson 1981). 


[Journal of Political Economy, 1985, vol. 93, no. 5] 

© 1985 by The University of Chicago. All rights reserved. 0022-3808/85/9305-0006$01.50 
Spence modeled all market participants as price takers. Each seller 
observes the market return to signaling and chooses the signal that is 
individually optimal. In equilibrium, all those buyers making trades 
based on signals find that their prior beliefs are confirmed ex post. 

While very much in the spirit of traditional Walrasian, “price¬ 
taking,” models, the conclusion that emerges is strikingly different. 
Instead of there being a unique equilibrium (or possibly a finite set of 
equilibria), Spence shows that market signaling equilibria form a con¬ 
tinuum. 

More recent papers by Riley (1975, 1979a) and Rothschild and 
Stiglitz (1976) make it clear that this result is critically linked to the 
assumption that all individuals are price takers. In the traditional full- 
information equilibrium, each agent is small relative to the markets in 
which he trades and there is no incentive to attempt price competi¬ 
tion—hence the price-taking assumption. However, with the infor¬ 
mational externality that underlies a market signaling equilibrium, it 
is no longer necessarily the case that price-taking behavior is individu¬ 
ally rational. 

Focusing on the application of signaling to the purchase of insur¬ 
ance, Rothschild and Stiglitz consider the simplest case of two types of 
agents—high and low risk. They show that, unless the proportion of 
high-risk types is sufficiently great, all the “Walrasian” signaling 
equilibria are unstable. That is, there is always some alternative op¬ 
portunity open to a buyer that, in the absence of reactions by other 
buyers, generates strictly greater expected profits. Equivalently, if the 
market is modeled as a noncooperative game, in which the buyers 
(principals) first announce what they will pay for different levels of 
the signal and sellers (agents) then respond, there is no Nash equilib¬ 
rium in pure strategies. 

My own papers focus primarily on the opposite polar case—a con¬ 
tinuum of agents. Adopting the game-theoretic terminology, a cen¬ 
tral conclusion is that nonexistence is generic in the class of models 
considered by Spence. In particular, raising the price offered to those 
choosing the lowest observed level of the signal is always profitable. 
Moreover, price competition may be profitable at higher levels of the 
signal as well. 

There have been several attempts to overcome this failure to ex¬ 
plain signaling behavior. Each of these builds on a paper by Wilson 
(1977), which begins with the premise that agents will anticipate the 
responses of others when they consider new actions. The least demanding 
of the alternative equilibrium concepts is the “reactive equilibrium.”² 
Loosely, a set of strategies $s_1^*, \ldots, s_n^*$ for n competing agents 
is a reactive equilibrium if two conditions are satisfied. First, for any 
agent i and any alternative strategy $s_i$ that raises i's payoff there is 
another agent j who can benefit by reacting at the expense of agent i. 
Second, there is no further reaction by a further agent that can make 
agent j's reaction unprofitable. The idea then is that agent i will 
recognize agent j's clear incentive to react and therefore will be deterred 
from choosing $s_i$ rather than $s_i^*$. 

2 For a comparison of three alternative non-Nash-equilibrium concepts see Riley 
(1979b). 

As argued in Riley (1979a), of the sets of signal-price contracts that 
separate out the different types of agents, there is a unique Pareto- 
dominating set, and this is a reactive equilibrium. Moreover, there can 
be no reactive equilibrium in which high-quality, low-signaling-cost 
agents are pooled with low-quality, high-signaling-cost agents. Thus 
the reactive equilibrium is unique. 

While the assumption of this greater level of sophistication is plausi¬ 
ble for some applications of the theory, there are other applications 
for which it is possible to take a more skeptical view. Thus, in this 
paper, an alternative way out of the nonexistence dilemma is exam¬ 
ined. Instead of modifying the equilibrium concept, the route chosen 
is the adaption of the model itself. It is argued that, despite the nega¬ 
tive conclusions of the published literature, there is a family of signal¬ 
ing models that generate an equilibrium satisfying the strong Nash 
equilibrium condition that all price competition must be unprofitable, 
in the absence of reactions by other price setters. 

These models differ from those appearing in the literature in only 
one critical way. Rather than assume that all agents would enter a 
particular market in a world of perfect information, I assume that, 
even in such a world, a positive fraction of the agents would choose 
not to participate. In the labor market, for example, suppose $\theta \in [0, 1]$ 
is the productivity of a given type in the production of a particular 
commodity. Then, as long as there is some alternative job opportunity 
offering any worker a wage $w_A$, only those for whom $\theta > w_A$ have an 
incentive to produce this commodity. 

Similarly, in the signaling of project quality by insider stockholding 
(Leland and Pyle 1977) and the signaling of loan quality by collateral 
or loan size (Milde and Riley 1984; Bester 1985), those entrepreneurs 
with sufficiently low-quality offerings would not be financed in a 
world of costless information about quality. 

Even in insurance markets, with perfect information about loss 
probabilities, nonparticipation will often be plausible. Under fair in¬ 
surance, the risk of loss L with probability p will be fully covered by a 
premium pL. Then those with sufficiently high probabilities of loss 
may be better off not undertaking the risky activity. 

This simple modification of the basic Spencian model is important 
because it eliminates the profitability of price competition at the lowest 
observed level of the signal. The primary focus of the paper is 
then to seek out conditions under which price competition is also 
unprofitable at higher levels of the signal. As Spence emphasized, an 
activity is a potential signal if it is less costly for those agents selling 
higher-quality products. The central result of this 
paper is that if the proportional rate of decrease of the marginal cost 
of signaling, with respect to quality, is sufficiently high, there exists a 
Nash equilibrium. That is, price competition is never profitable. 

The paper is organized as follows. Section I lays out a principal- 
agent model with hidden knowledge. Section II examines, in detail, 
the case in which there is a discrete number of types of sellers. In 
particular, conditions are derived under which there is a Nash equi¬ 
librium for both the labor market model of Spence and the insurance 
market model of Rothschild and Stiglitz. Section III considers the 
labor market model under the assumption of a continuum of agents 
and shows that, in this limiting case as well, there are reasonable 
conditions under which a Nash equilibrium exists. Some concluding 
remarks appear in Section IV. 


I. A Many Principal-Many Agent Model 
with Hidden Knowledge 

Consider a market in which each of the set of potential sellers (agents) 
can provide one unit of a commodity or service. Sellers can also 
choose the level, s, at which to engage in some sales-related activity, 
that is, to “signal.” Differences among sellers are assumed to be 
parameterizable by a single hidden characteristic $\theta \in \Theta$. We shall 
therefore refer to a seller as being of “type $\theta$.” 

A contract (s, r) between a buyer (principal) and seller is a payment 
r in return for signal level s. If a type $\theta$ seller accepts (s, r) the value of 
his product is $V(\theta, s)$ so that the buyer’s profit is 

$$\Pi(\theta, s, r) = V(\theta, s) - r. \tag{1}$$

Types are parameterized so that higher levels of $\theta$ imply higher 
product value $V(\theta, s)$. It is also assumed that V is nondecreasing in s. 

Preferences over alternative offers (s, r) are represented by the 
utility function $U(\theta, s, r)$, where U is increasing in the return r and 
decreasing in s. For every seller there is also a mutually exclusive 
alternative to trading in this market that yields a utility level $U_A$. 

Finally, assume that the marginal cost of signaling, that is, the increase 
in return required for a seller to be willing to increase his 
signaling activity, diminishes with $\theta$. Formally, 

$$\left.\frac{dr}{ds}\right|_{U} = -\frac{\dfrac{\partial U}{\partial s}(\theta, s, r)}{\dfrac{\partial U}{\partial r}(\theta, s, r)} \tag{2}$$

decreases with $\theta$. Condition (2) guarantees that, for any set of offers, 
the choice of signal level $s(\theta)$ will be nondecreasing in $\theta$. 


II. Nash Equilibrium with a Finite Number 
of Types of Seller 

Rather than discuss this model in abstract terms, let us begin with a 
simple example of labor market signaling. A type $\theta$ worker who 
chooses signal level s has a value to each of the firms in some industry 
of 

$$V(\theta, s) = \theta. \tag{3}$$

Each worker also has an opportunity to work elsewhere for a wage $r_A$. 
The cost of signaling at level s is $C(\theta, s)$; thus the net return to type $\theta$ if 
offered the wage contract (s, r) is 

$$U(\theta, s, r) = r - C(\theta, s). \tag{4}$$

Condition (2) then reduces simply to the requirement that the marginal 
cost of signaling $\partial C/\partial s$ be lower for more productive workers. 

As Rothschild and Stiglitz (1976) showed for the two-type case, no 
contract that attracts more than one type can be part of a Nash (or 
stable Walrasian) equilibrium. More generally (see Wilson 1977) we 
have the following proposition. 

Proposition 1. A Nash equilibrium contains no pools. 

Given this result we need only examine sets of contracts that separate 
out all those types who choose to signal. To simplify the analysis 
further we assume there are just three types of agents so that $\Theta = \{\theta_0, \theta_1, \theta_2\}$. 
We further assume that 

$$\theta_0 < r_A < \theta_1 < \theta_2. \tag{5}$$

Each worker chooses the contract (s, r) that maximizes his net gain 
$U(\theta, s, r)$. Moreover, if more than one contract yields the same utility 
we assume that the contract selected is the one with the lowest level of 
the signal.³ 

One possible set of contracts that separates out the three types is the 
set $\{E_0, E_1, E_2\}$ depicted in figure 1. Type $\theta_0$, with the steepest indifference 
map, $U^0 = U(\theta_0, s, r)$, chooses the contract $E_0$. Type $\theta_1$, with 
indifference map $U^1 = U(\theta_1, s, r)$, chooses $E_1$. Finally, type $\theta_2$, with 
the least steep indifference map, $U^2 = U(\theta_2, s, r)$, chooses $E_2$. 

3 This essentially technical problem, of nonunique optimal choices, disappears when 
types are distributed continuously. 

Fig. 1. Pareto-efficient separating set of contracts 

Note that only those workers with productivity exceeding $r_A$ find 
signaling desirable. Thus the allocation of workers between the two 
industries is efficient. Note also that the profit on each contract is 
zero. Note, finally, that each type $\theta_i$ is indifferent between his choice $E_i$ 
and the choice $E_{i+1}$ of type $\theta_{i+1}$. 

It should therefore be intuitively clear that, of all sets of contracts 
that separate out the different types, the set $\{E_0, E_1, E_2\}$ is Pareto 
efficient. Formally, modifying only slightly arguments in Riley 
(1979a) and Engers and Fernandez (1984), we have the following 
result. 

Proposition 2. Characterization of the Pareto-efficient set of separating 
contracts. Suppose the hypotheses of Section I are satisfied. Then, of 
all the sets of contracts that are individually not unprofitable and 
separate out those types who signal, there is a unique set that is Pareto 
efficient for the agents. This set (i) allocates types efficiently between 
those who signal and those who do not; (ii) generates zero profits on 
each contract; and, if the set of types is discrete, (iii) has the property 
that, if $E_i$ is the choice of type $\theta_i$, 

$$E_i \underset{\theta_i}{\sim} E_{i+1},$$

that is, type $\theta_i$ is indifferent between $E_i$ and $E_{i+1}$. 


As shown in figure 1, $\{E_0, E_1, E_2\}$ is not a Nash equilibrium. Note 
that any offer in the dotted region is attractive to types $\theta_1$ and $\theta_2$. 
Then, if the average productivity of these two types, $\bar\theta_{12}$, is as depicted, 
any alternative offer in the interior of the diagonally shaded region is 
strictly profitable.⁴ 

Given the distribution of types who choose to signal, the question 
we wish to address here is whether there are conditions under which 
such profitable alternatives do not exist. 

First of all, as long as the proportion of type $\theta_0$, who choose not to 
signal, is sufficiently large, it will never be profitable to make an offer 
that attracts all three types. Given this assumption, the diagonally 
shaded region is the entire set of potentially profitable alternatives. 
Since the indifference curve $U(\theta_2, s, r) = U_E^2$ bounding this set is 
upward sloping, the most profitable of these alternatives is the point 
D, where the indifference curves 

$$U(\theta_0, s, r) = U(\theta_0, 0, r_A) = U_E^0$$
$$U(\theta_2, s, r) = U(\theta_2, s_2, r_2) = U_E^2 \tag{6}$$

intersect. 

Holding fixed the preferences of type $\theta_0$, we can vary D by altering 
the shape of the indifference curves of the other two types. We then 
seek conditions under which $r_D > \bar\theta_{12}$. Clearly this will be the case if 
$XD < \theta_2 - \bar\theta_{12}$, that is, if 

$$\frac{XD}{XZ} < \frac{\theta_2 - \bar\theta_{12}}{\theta_2 - \theta_1} = \frac{f_1}{f_1 + f_2},$$

where $f_i$ is the proportion of type $\theta_i$ in the population. Since XZ exceeds 
XY, it follows that a sufficient condition for $r_D$ to exceed $\bar\theta_{12}$ is 
that 

$$\frac{XD}{XY} < \frac{f_1}{f_1 + f_2}. \tag{7}$$


4 Using fig. 1, it is easy to see why, in a Nash equilibrium, there can be no pooling of 
different types. Suppose, e.g., that both type 1 and type 2 were to choose the contract 
$(s_D, r_D)$. For this contract to be not unprofitable, $r_D \le \bar\theta_{12}$. But then there is always 
an alternative contract, indicated by the point T in the figure, that is preferred only by 
type $\theta_2$ and that is strictly profitable. 
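In terms of the signaling cost function, 

$$\frac{XD}{XY} = \frac{C(\theta_2, s_2) - C(\theta_2, s_D)}{C(\theta_1, s_2) - C(\theta_1, s_D)} = \frac{\displaystyle\int_{s_D}^{s_2} \frac{\partial C}{\partial s}(\theta_2, s)\,ds}{\displaystyle\int_{s_D}^{s_2} \frac{\partial C}{\partial s}(\theta_1, s)\,ds} \le \max_{s \in [s_D, s_2]} \frac{\dfrac{\partial C}{\partial s}(\theta_2, s)}{\dfrac{\partial C}{\partial s}(\theta_1, s)}.$$

Therefore wage competition is unprofitable if, for all s, 

$$\frac{\dfrac{\partial C}{\partial s}(\theta_2, s)}{\dfrac{\partial C}{\partial s}(\theta_1, s)} < \frac{f_1}{f_1 + f_2}. \tag{8}$$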


For the three-type case we have therefore proved the following proposition. 

Proposition 3. Sufficient conditions for a Nash equilibrium: the separable 
case. Suppose alternative opportunities are such that a large number 
of sellers with low-value products would, in a world of full information, 
choose not to enter the market. Then, if the proportional rate of 
decline with $\theta$ of the marginal cost of signaling, $\partial C(\theta, s)/\partial s$, is 
sufficiently large, the Pareto-efficient set of separating contracts is a 
Nash equilibrium. 
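
The three-type argument can be checked numerically. The sketch below is only an illustration, not the paper's procedure: it assumes the cost function C(theta, s) = s / theta**e (the form used later in Section III), and the function names and parameter values are hypothetical. It builds the separating contracts, locates the most profitable deviation D, and compares the wage at D with the average productivity of the two signaling types.

```python
def separating_contracts(thetas, r_A, e):
    """Signal levels (s1, s2) of the Pareto-efficient separating contracts.

    Assumes C(theta, s) = s / theta**e, so U = r - s / theta**e.
    """
    t0, t1, t2 = thetas
    # Type t0 stays at E0 = (0, r_A) and is just indifferent to E1 = (s1, t1).
    s1 = (t1 - r_A) * t0**e
    # Type t1 is just indifferent between E1 = (s1, t1) and E2 = (s2, t2).
    s2 = s1 + (t2 - t1) * t1**e
    return s1, s2


def best_deviation(thetas, r_A, e):
    """Point D = (s_D, r_D) where type t0's indifference curve through E0
    meets type t2's indifference curve through E2."""
    t0, _, t2 = thetas
    _, s2 = separating_contracts(thetas, r_A, e)
    # Solve r_A + s / t0**e = t2 - (s2 - s) / t2**e for s.
    s_D = (t2 - r_A - s2 / t2**e) / (1.0 / t0**e - 1.0 / t2**e)
    r_D = r_A + s_D / t0**e
    return s_D, r_D


def condition_8_holds(thetas, fractions, e):
    """Sufficient condition (8): ratio of marginal costs below f1 / (f1 + f2)."""
    _, t1, t2 = thetas
    _, f1, f2 = fractions
    return (t1 / t2) ** e < f1 / (f1 + f2)


def deviation_unprofitable(thetas, fractions, r_A, e):
    """True when the wage at D exceeds the average productivity of types 1 and 2."""
    _, t1, t2 = thetas
    _, f1, f2 = fractions
    mean_12 = (f1 * t1 + f2 * t2) / (f1 + f2)
    _, r_D = best_deviation(thetas, r_A, e)
    return r_D > mean_12
```

With the hypothetical values thetas = (0.5, 1.0, 2.0), r_A = 0.75, equal fractions of the two signaling types, and e = 2, condition (8) holds and the wage at D is 1.8, above the pooled average productivity of 1.5, so the deviation loses money.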

With more than three types it should be clear that the same argu¬ 
ment will hold for every potential pool of two types. Actually, an 
almost identical argument can be used for larger pools as well. Thus 
the proposition is quite general. 

I now show how the result can be extended to the more general case 
in which the utility function, $U(\theta, s, r)$, is nonseparable and the valuation 
function, $V(\theta, s)$, depends not only on type but also on the level of 
the signal s. 

Proposition 4. Sufficient conditions for a Nash equilibrium. Suppose 
alternative opportunities are such that a large number of sellers with 
low-value products would, in a world of full information, choose not 
to enter the market. Suppose also that, for the general model of 
Section I, the marginal cost of signaling is, for each of n types of 
agents, nonincreasing in the return to signaling, r. Then the Pareto- 
efficient set of separating contracts is a Nash equilibrium whenever 

$$-\frac{\partial}{\partial \theta}\left(\left.\frac{dr}{ds}\right|_{U}\right) \Bigg/ \left.\frac{dr}{ds}\right|_{U}$$

is sufficiently large. 

As above I analyze the special case with just three types. The 
generalization to n types is straightforward. Consider figure 1 again. 
Since $V(\theta, s)$ is nondecreasing in s, the average value of types $\theta_1$ and $\theta_2$, 
if they both choose the contract D, is no greater than 

$$\bar V_{12} = \frac{f_1 V(\theta_1, s_2) + f_2 V(\theta_2, s_2)}{f_1 + f_2}.$$

Then, just as in the earlier argument, there are no profitable alternatives 
if XD/XY is sufficiently small. But 

$$XD = \int_{s_D}^{s_2} \frac{dr}{ds}(\theta_2, s, r)\,ds,$$

where the integral is along the arc $U^2 = U_E^2$, and 

$$XY = \int_{s_D}^{s_2} \frac{dr}{ds}(\theta_1, s, r)\,ds$$

along the arc $U^1 = U_E^1$. By hypothesis $(dr/ds)|_U$ is nonincreasing in r. Then 

$$XD \le \int_{s_D}^{s_2} \frac{dr}{ds}(\theta_2, s, r)\,ds$$

along the arc $U^1 = U_E^1$, so that 

$$\frac{XD}{XY} \le \frac{\displaystyle\int_{s_D}^{s_2} \frac{dr}{ds}(\theta_2, s, r)\,ds}{\displaystyle\int_{s_D}^{s_2} \frac{dr}{ds}(\theta_1, s, r)\,ds},$$

where both integrals are along the arc $U^1 = U_E^1$, and hence 

$$\frac{XD}{XY} \le \max \frac{\dfrac{dr}{ds}(\theta_2, s, r)}{\dfrac{dr}{ds}(\theta_1, s, r)}$$

along the arc $U^1 = U_E^1$. 

By making the proportional rate of decline in the marginal cost of 
signaling with respect to $\theta$ sufficiently large, the right-hand side of 
this inequality can be made arbitrarily small. Then the offer D can be 
chosen arbitrarily close to the horizontal line $r = V(\theta_2, s_2)$ and hence 
above $\bar V_{12}$. Q.E.D. 

This more general result, as well as covering generalizations of 
Spence’s (1974) labor market model to allow for a direct productivity- 
enhancing effect of the signal, can also be applied to Rothschild and 
Stiglitz’s (1976) model of insurance with differing risk classes. In the 
latter model, each individual, with von Neumann-Morgenstern utility 
function $u(\cdot)$ and initial wealth w, can insure himself against a loss L 
by paying a premium p. Alternatively, by coinsuring, that is, accepting 
a deductible of s, the individual receives a premium reduction of r. In 
the no-loss state, which occurs with probability $\theta$, wealth is therefore 
$w - (p - r) = n + r$, where $n = w - p$. In the loss state wealth is 
$w - L + (L - s) - (p - r) = n + r - s$. Expected utility can then be 
expressed as 

$$U(\theta, s, r) = \theta u(n + r) + (1 - \theta) u(n + r - s). \tag{9}$$

If a type $\theta$ individual accepts the insurance contract (s, r) the expected 
profit on this contract is 

$$\Pi(\theta, s, r) = p - r - (1 - \theta)(L - s) = V(\theta, s) - r. \tag{10}$$

From (9) the marginal cost of signaling is 

$$\left.\frac{dr}{ds}\right|_{U} = -\frac{\partial U/\partial s}{\partial U/\partial r} = \frac{(1-\theta)\,u'(n+r-s)}{\theta u'(n+r) + (1-\theta)\,u'(n+r-s)} = \frac{1}{1 + \dfrac{\theta}{1-\theta}\,\dfrac{u'(n+r)}{u'(n+r-s)}}. \tag{11}$$

It follows immediately that, as required, the marginal cost of signaling 
declines with the quality of the insurance risk (the probability of no 
loss, $\theta$). 

We now ask under what conditions the hypotheses of proposition 4 
are most likely to be satisfied in an insurance market. From (11), the 
marginal cost of signaling is nonincreasing in r if $u'(n + r)/u'(n + r - s)$ 
is nondecreasing in r, that is, if 

$$\frac{\partial}{\partial r}\left[\frac{u'(n+r)}{u'(n+r-s)}\right] = [A(n+r-s) - A(n+r)]\,\frac{u'(n+r)}{u'(n+r-s)} \ge 0, \tag{12}$$

where $A(x) = -u''(x)/u'(x)$ is the degree of absolute risk aversion. 

Consider next the rate of decline of the marginal cost of signaling. 
Differentiating the logarithm of (11) by $\theta$ we obtain 

$$-\frac{\partial}{\partial\theta}\left(\left.\frac{dr}{ds}\right|_{U}\right) \Bigg/ \left.\frac{dr}{ds}\right|_{U} = \frac{\dfrac{1}{(1-\theta)^2}\,\dfrac{u'(n+r)}{u'(n+r-s)}}{1 + \dfrac{\theta}{1-\theta}\,\dfrac{u'(n+r)}{u'(n+r-s)}} = \frac{1}{(1-\theta)^2\,\dfrac{u'(n+r-s)}{u'(n+r)} + \theta(1-\theta)}.$$

From proposition 4, a Nash equilibrium exists if this expression is 
sufficiently large. Note that as the probability of no loss, $\theta$, approaches 
unity, the denominator approaches zero. Therefore, as long as the 
loss probabilities are all sufficiently low, the Pareto-efficient separating 
contract set is a Nash equilibrium. 
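
As a rough numerical illustration of the insurance case, the sketch below evaluates the marginal cost of signaling in (11) and its proportional rate of decline with respect to the no-loss probability. Logarithmic utility, the parameter values, and all function names are assumptions made for the example only; any increasing concave utility could be substituted through the `u_prime` argument.

```python
import math


def log_utility_prime(wealth):
    """Marginal utility for u(w) = log(w); an assumed utility, for illustration."""
    return 1.0 / wealth


def marginal_cost(theta, s, r, n, u_prime=log_utility_prime):
    """Marginal cost of signaling dr/ds along an indifference curve, as in (11)."""
    ratio = u_prime(n + r) / u_prime(n + r - s)
    return 1.0 / (1.0 + theta / (1.0 - theta) * ratio)


def rate_of_decline(theta, s, r, n, u_prime=log_utility_prime):
    """Closed-form proportional rate of decline of the marginal cost in theta."""
    inverse_ratio = u_prime(n + r - s) / u_prime(n + r)
    return 1.0 / ((1.0 - theta) ** 2 * inverse_ratio + theta * (1.0 - theta))


def numerical_rate_of_decline(theta, s, r, n, u_prime=log_utility_prime, h=1e-6):
    """Central-difference estimate of -d log(marginal cost) / d theta."""
    upper = math.log(marginal_cost(theta + h, s, r, n, u_prime))
    lower = math.log(marginal_cost(theta - h, s, r, n, u_prime))
    return -(upper - lower) / (2.0 * h)
```

For the assumed values n = 10, r = 1, s = 2, the rate of decline is 1.8 at a no-loss probability of 0.5 and rises to roughly 9.8 at 0.9, consistent with the statement that the expression grows without bound as the loss probability goes to zero.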


III. Nash Equilibrium with a Continuous 
Distribution of Types 

While the simple derivations of the previous section provide some 
intuition into the importance of the proportional rate of decline of 
the marginal cost of signaling, the arguments themselves rest heavily 
on the assumption that differences between neighboring types are 
discrete. This is easily seen by referring back to the sufficient condition 
(8) for the simple labor market model. Note that if the marginal 
cost of signaling varies continuously with $\theta$, 

$$\lim_{\theta_2 \downarrow \theta_1} \frac{\dfrac{\partial C}{\partial s}(\theta_2, s)}{\dfrac{\partial C}{\partial s}(\theta_1, s)} = 1 > \frac{f_1}{f_1 + f_2}.$$

Thus condition (8) is only satisfied whenever the difference between 
neighboring types of seller is sufficiently large. However, this result 
does not itself imply that propositions 3 and 4 are false. To demonstrate 
this, consider the simple labor market case in which the cost of 
signaling $C(\theta, s) = s/\theta$. Then $U(\theta_i, s, r) = r - (s/\theta_i)$. Consider figure 1 
once again. Since $E_1$ and D lie on the same indifference curve for type 
$\theta_0$, 

$$r_D - \theta_1 = \frac{s_D - s_1}{\theta_0}. \tag{13}$$

Similarly, since $E_1$ and $E_2$ lie on an indifference curve for type $\theta_1$ and 
D and $E_2$ lie on an indifference curve for type $\theta_2$, 

$$\theta_1 - \theta_2 = \frac{s_1 - s_2}{\theta_1} \tag{14}$$

and 

$$\theta_2 - r_D = \frac{s_2 - s_D}{\theta_2}. \tag{15}$$

Multiplying (13) by $\theta_0$, (14) by $\theta_1$, (15) by $\theta_2$, and then adding we 
obtain 

$$\theta_2^2 - \theta_0\theta_1 - r_D(\theta_2 - \theta_0) + \theta_1(\theta_1 - \theta_2) = 0. \tag{16}$$

Since we are interested in the limit as both $\theta_2 - \theta_1$ and $\theta_1 - \theta_0$ 
approach zero, let $k = (\theta_2 - \theta_1)/(\theta_1 - \theta_0)$. 

Substituting into (16) and rearranging we obtain $r_D = (k\theta_2 + \theta_1)/(k + 1)$. 
Comparing this expression with $\bar\theta_{12} = (f_1\theta_1 + f_2\theta_2)/(f_1 + f_2)$, it 
follows that there is a Nash equilibrium if $1/(k + 1) < f_1/(f_1 + f_2)$. 
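
The algebra leading from (13)-(15) to the expression for the wage at D can be verified directly. The following sketch, under the text's assumption that the cost of signaling is s divided by the type, solves the three indifference conditions for the wage at D and compares it with the closed form in terms of k. Function names and the sample numbers are illustrative assumptions.

```python
def r_D_direct(t0, t1, t2, s1):
    """Wage at D obtained by solving the indifference conditions (13)-(15)
    for the cost function C(theta, s) = s / theta."""
    # (14): type t1 is indifferent between E1 = (s1, t1) and E2 = (s2, t2).
    s2 = s1 + t1 * (t2 - t1)
    # Equate r_D from (13) and (15) and solve for the signal level at D.
    s_D = (t2 - t1 - s2 / t2 + s1 / t0) / (1.0 / t0 - 1.0 / t2)
    # (13): D lies on type t0's indifference curve through E1.
    return t1 + (s_D - s1) / t0


def r_D_closed_form(t0, t1, t2):
    """Closed form r_D = (k * t2 + t1) / (k + 1), with k the ratio of type gaps."""
    k = (t2 - t1) / (t1 - t0)
    return (k * t2 + t1) / (k + 1.0)


def nash_condition(t0, t1, t2, f1, f2):
    """Condition 1 / (k + 1) < f1 / (f1 + f2) from the text."""
    k = (t2 - t1) / (t1 - t0)
    return 1.0 / (k + 1.0) < f1 / (f1 + f2)
```

For types 0.5, 1 and 2 both routes give a wage at D of 5/3, whatever value of s1 is supplied, since s1 cancels when the three equations are added. With equal proportions of the two signaling types the condition holds (1/3 < 1/2).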

This conclusion holds even in the limit, as the difference across 
types goes to zero, which strongly suggests that a similar result holds 
when the set of sellers forms a continuum. Unfortunately, analysis of 
the continuous case with nonseparable preferences is extremely intricate. 
However, clear-cut results are obtainable for the simple labor 
market model. To focus on essentials we make a further simplification 
and assume that the cost of signaling takes on the special multiplicative 
form $C(\theta, s) = s/m(\theta)$, $m'(\theta) > 0$.⁵ Workers are assumed to be 
distributed continuously on the interval [a, b], with $a < r_A < b$. The 
cumulative density function for $\theta$, $F(\theta)$, is assumed to be twice 
continuously differentiable and strictly increasing on [a, b]. 

From proposition 2 we seek a set of contracts that allocates the 
workers across industries efficiently, separates out all those types who 
signal, and generates zero profits. Thus we seek a wage function r = 
W(s) such that $s(\theta)$, which solves 

$$\max_{s}\; U[\theta, s, W(s)] = W(s) - \frac{s}{m(\theta)}, \tag{17}$$

also satisfies 

$$W[s(\theta)] = \begin{cases} r_A, & \theta \le r_A \\ \theta, & \theta > r_A. \end{cases} \tag{18}$$

5 The assumption that $C(\theta, s) = s/m(\theta)$ is not as restrictive as it might seem. Suppose 
instead that z is the level of the signaling activity with signaling cost 
$\hat C(\theta, z) = A(z)/m(\theta)$, where A(z) is strictly increasing. Then we can always define 
the inverse function $z = A^{-1}(s)$ and define the equivalent signaling cost function 
$C(\theta, s) = \hat C(\theta, A^{-1}(s)) = s/m(\theta)$. 

Fig. 2. Pareto-efficient separating wage function 


Such a wage function is illustrated in figure 2. Type $\theta$, observing W(s), 
chooses $s(\theta)$. As required the resulting wage paid, $W[s(\theta)]$, is equal to 
this worker's productivity, $\theta$. 

Suppose, as we shall later confirm, that W(s) is differentiable. Then, 
for any type choosing a positive s we require 

$$W'(s(\theta)) = \frac{1}{m(\theta)}. \tag{19}$$

Combining (18) and (19) yields the ordinary differential equation 
$m(W)W'(s) = 1$, $W(0) = r_A$. Integrating and making use of the boundary 
condition we obtain 

$$\int_{r_A}^{W(s)} m(w)\,dw = s. \tag{20}$$
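
To make the wage schedule defined by (20) concrete, the sketch below works it out for the power form m(w) = w**e, an assumption borrowed from the special case studied later in the section. Under that assumption the integral can be inverted in closed form. The function names are hypothetical.

```python
def wage(s, r_A, e):
    """Separating wage schedule W(s) solving (20) when m(w) = w**e.

    The integral of w**e from r_A to W equals s, so
    W(s) = ((e + 1) * s + r_A**(e + 1)) ** (1 / (e + 1)).
    """
    return ((e + 1.0) * s + r_A ** (e + 1.0)) ** (1.0 / (e + 1.0))


def signal(theta, r_A, e):
    """Signal level s(theta) chosen by a type theta > r_A, so that W(s(theta)) = theta."""
    return (theta ** (e + 1.0) - r_A ** (e + 1.0)) / (e + 1.0)


def ode_residual(s, r_A, e, h=1e-6):
    """Numerical check of m(W) * W'(s) - 1, which should be close to zero."""
    slope = (wage(s + h, r_A, e) - wage(s - h, r_A, e)) / (2.0 * h)
    return wage(s, r_A, e) ** e * slope - 1.0
```

The schedule starts at the reservation wage (W(0) = r_A), pays each signaling type exactly its productivity, and satisfies the differential equation up to finite-difference error.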

Note that the wage schedule W(s), given implicitly by (20), is differentiable 
as hypothesized. We now seek conditions under which this 
schedule is a Nash equilibrium. Actually, we consider only the question 
of whether W(s) is a local Nash equilibrium. That is, we seek 
conditions under which each alternative contract $(\hat s, \hat r)$ sufficiently 
close to the schedule (s, W(s)) is unprofitable.⁶ 

6 In an earlier version of this manuscript (Riley 1983), conditions for a Nash equilibrium 
are also examined. It is shown that, as long as one mild additional restriction 
holds, a local Nash equilibrium is also a Nash equilibrium. 




Since the marginal type $r_A$ is just indifferent 
between signaling and not signaling, his indifference curve through 
$(0, r_A)$ must, as depicted, be tangential to W(s) at s = 0. Of course, 
with $\hat r > r_A$ type $r_A$ strictly prefers the new offer. Indeed there is an 
interval of types $(r_A, \delta)$ who are strictly better off under the new offer. 
For $\hat r - r_A$ sufficiently small, this new offer will attract an interval of types with 
average productivity in excess of the offered wage. However, with 
the alternative opportunity yielding a wage $r_A < \hat r$, all those types on 
the interval $[a, r_A]$ also find the new offer attractive. Then the average 
productivity of those accepting the new offer is $\int_a^{\delta} \theta\,dF / F(\delta)$. 

As $\hat r \to r_A$, $\delta \to r_A$, and hence the average productivity approaches 
$\int_a^{r_A} \theta\,dF / F(r_A)$, which is strictly less than $r_A$. Then for $\hat r > r_A$ and 
sufficiently close the new offer is unprofitable. 

The other alternative is a new offer $(\hat s, \hat r)$ designed to attract all 
those types on some interval $(\beta, \gamma)$. This is illustrated in figure 3. An 
agent of type $\beta$, with indifference curve $U^\beta$ through his first signaling 
point $(s(\beta), \beta)$, is just indifferent between the latter and the new 
alternative. Similarly an agent of type $\gamma$ is just indifferent, while all 
those for whom $\theta \in (\beta, \gamma)$ strictly prefer $(\hat s, \hat r)$. 

We next obtain an expression for $\hat r$ in terms of $\beta$ and $\gamma$ and then 
compare this new offer with the average productivity of those accepting 
it. 

From (17) the steepness of an indifference curve for type $\theta$ is 
$1/m(\theta)$. Then $(\hat s, \hat r)$ must satisfy 

$$\hat r - \beta = \frac{\hat s - s(\beta)}{m(\beta)}, \qquad \hat r - \gamma = \frac{\hat s - s(\gamma)}{m(\gamma)}. \tag{21}$$

Eliminating $\hat s$ we then obtain 

$$\hat r\,[m(\gamma) - m(\beta)] = \gamma m(\gamma) - \beta m(\beta) - [s(\gamma) - s(\beta)]. \tag{22}$$

But, from (18) and (20), 

$$s(\gamma) - s(\beta) = \int_{\beta}^{\gamma} m(w)\,dw = \gamma m(\gamma) - \beta m(\beta) - \int_{\beta}^{\gamma} w\,m'(w)\,dw. \tag{23}$$

Substituting (23) into (22), we obtain 

$$\hat r = \frac{\displaystyle\int_{\beta}^{\gamma} w\,m'(w)\,dw}{m(\gamma) - m(\beta)}. \tag{24}$$

Fig. 3. Interior wage competition 

Thus, to attract workers with productivity in the interval $(\beta, \gamma)$ an 
employer announces the new offer $(\hat s, \hat r)$, where $\hat r$ satisfies (24) and $\hat s$ 
satisfies (21). 

To determine the profitability of such an offer we must compare it 
with the average productivity of those accepting, that is, 

$$\bar\theta = \frac{\displaystyle\int_{\beta}^{\gamma} \theta\,dF(\theta)}{F(\gamma) - F(\beta)}. \tag{25}$$

To do this we fix $\beta$ at some arbitrary level and examine the change in $\hat r$ 
and $\bar\theta$ with $\gamma$ as $\gamma \downarrow \beta$. Appealing to L’Hôpital’s rule we have the 
following useful result, which is proved in the Appendix. 

Lemma 1. Define $y(\gamma) = \int_{\beta}^{\gamma} \theta H'(\theta)\,d\theta / [H(\gamma) - H(\beta)]$, where $H(\cdot)$ is 
twice continuously differentiable. Then (i) $y(\beta) = \beta$, (ii) $y'(\beta) = 1/2$, 
and (iii) $y''(\beta) = H''(\beta)/[6H'(\beta)]$. 
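
Lemma 1 lends itself to a quick finite-difference check. The sketch below is an illustration only: it assumes the power family H(theta) = theta**p, for which the integral has a closed form, and uses hypothetical function names. It estimates the first and second derivatives of y at beta; in this family the second derivative comes out proportional to H''(beta)/H'(beta), with a factor of 1/6.

```python
def y(gamma, beta, p):
    """y(gamma) from lemma 1 for H(theta) = theta**p (an assumed test family).

    The numerator is the integral of theta * H'(theta) from beta to gamma.
    """
    numerator = p / (p + 1.0) * (gamma ** (p + 1.0) - beta ** (p + 1.0))
    denominator = gamma ** p - beta ** p
    return numerator / denominator


def derivatives_at_beta(beta, p, h=1e-3):
    """Central-difference estimates of y'(beta) and y''(beta), using y(beta) = beta."""
    up = y(beta + h, beta, p)
    down = y(beta - h, beta, p)
    first = (up - down) / (2.0 * h)
    second = (up - 2.0 * beta + down) / h**2
    return first, second


def predicted_second_derivative(beta, p):
    """H''(beta) / (6 * H'(beta)) for H(theta) = theta**p, i.e. (p - 1) / (6 * beta)."""
    return (p - 1.0) / (6.0 * beta)
```

Because y is a weighted average of theta over the interval, it is also well defined when gamma lies just below beta, which is what allows central differences around beta.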

To focus on the effect of the proportional rate of decline in the 
marginal cost of signaling we now make the further simplifying assumption 
that $C(\theta, s) = s/\theta^e$. Then 

$$e = -\theta\,\frac{\dfrac{\partial}{\partial\theta}\left(\dfrac{\partial C}{\partial s}\right)}{\dfrac{\partial C}{\partial s}},$$

the elasticity of the marginal cost of signaling, is constant. Appealing 
to lemma 1, it follows that, for $\gamma - \beta$ sufficiently small, $\hat r$ exceeds $\bar\theta$ if 
and only if $e - 1 > \theta F''/F'$. We have therefore proved the following 
proposition. 

Proposition 5. Necessary and sufficient conditions for a local Nash equilibrium. 
With a positive mass choosing the reservation wage $r_A$, and 
with signaling cost function $C(\theta, s) = s/\theta^e$, the Pareto-efficient separating 
wage function is a local Nash equilibrium if the elasticity, with 
respect to $\theta$, of the marginal cost of signaling exceeds the elasticity of 
the density function by more than unity, that is, $e > 1 + [\theta F''(\theta)/F'(\theta)]$. 
If the inequality is reversed there is no local Nash equilibrium. 
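
A simple special case shows how the condition in proposition 5 operates. The sketch below assumes a uniform distribution of types, so the density is flat and the condition reduces to an elasticity above one. It computes the wage needed to attract an interval of types from (24), using the power cost form, and compares it with the interval's mean productivity. The uniform distribution, the interval, and the function names are all assumptions of the example.

```python
def offered_wage(beta, gamma, e):
    """Wage from (24) that attracts exactly the types in (beta, gamma),
    for the power form m(w) = w**e."""
    numerator = e / (e + 1.0) * (gamma ** (e + 1.0) - beta ** (e + 1.0))
    return numerator / (gamma ** e - beta ** e)


def mean_productivity_uniform(beta, gamma):
    """Average productivity on (beta, gamma) when types are uniformly distributed."""
    return 0.5 * (beta + gamma)


def interior_deviation_unprofitable(beta, gamma, e):
    """True when the required wage exceeds the average productivity attracted,
    so that the deviating offer loses money."""
    return offered_wage(beta, gamma, e) > mean_productivity_uniform(beta, gamma)
```

On the interval (1, 1.1), an elasticity of 2 requires a wage of about 1.0508 against a mean productivity of 1.05, so the deviation is unprofitable. With an elasticity of 0.5 the required wage falls to about 1.0496, below the mean, and the deviation pays.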

In addition to noting the close link between this result and those of 
the previous section, it is interesting to consider the benefits from 
signaling for different signaling cost elasticities. From (23), with $m(w) = w^e$, 
$s(\theta) = \int_{r_A}^{\theta} w^e\,dw$. Therefore, the equilibrium cost of signaling for 
type $\theta$ is 

$$C[\theta, s(\theta)] = \frac{s(\theta)}{\theta^e} = \int_{r_A}^{\theta} \left(\frac{w}{\theta}\right)^{e} dw.$$

Note that the integrand is a decreasing function of e for all $\theta$. We have 
therefore proved the following proposition. 

Proposition 6. Ranking signaling technologies. With signaling cost 
function $C(\theta, s) = s/\theta^e$ and productivity independent of s, the higher is 
e (and hence the larger is the proportional rate of decline in the 
marginal cost of signaling), the greater is the equilibrium return to all 
those signaling. 
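
The ranking in proposition 6 can be illustrated by evaluating the equilibrium signaling cost in closed form. The sketch below assumes the power cost form used in the proposition and integrates the expression for the equilibrium cost analytically. The function names and the sample values are hypothetical.

```python
def equilibrium_signaling_cost(theta, r_A, e):
    """Integral of (w / theta)**e over w from r_A to theta, evaluated in closed form."""
    return (theta ** (e + 1.0) - r_A ** (e + 1.0)) / ((e + 1.0) * theta ** e)


def equilibrium_net_return(theta, r_A, e):
    """Wage (equal to productivity theta) less the equilibrium cost of signaling."""
    return theta - equilibrium_signaling_cost(theta, r_A, e)
```

For a type with productivity 2 and a reservation wage of 1, the cost falls from 0.75 at an elasticity of 1 to about 0.583 at 2 and 0.469 at 3, so the net return rises with the elasticity, as the proposition states.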

Combining propositions 5 and 6 we observe that those values of e 
that generate sufficiently large potential gains from signaling also 
lead to the existence of a Nash equilibrium. Thus, at least in the labor 
market case, the equilibrium problems tend to arise only when the 
potential gains from signaling are small. 

IV. Concluding Remarks 

Taken together, the results above indicate that there are quite rea¬ 
sonable assumptions, for both the insurance market and labor market 
applications, under which the many principal-many agent problem 
has a (unique) Nash equilibrium. Furthermore, in the labor market 
application, the sufficient conditions for a Nash equilibrium are 
satisfied whenever the gains to those signaling are sufficiently large. 

However, it should be emphasized that, for the limiting case of a 
continuum of agents, we have derived our result only under simpli¬ 
fying separability assumptions. While it is my conjecture that these 
assumptions are not necessary, further generalization seems likely to 
be technically intricate. 
Turning to more fundamental theoretical issues, it should be noted 
that all the published literature makes the key assumption that there 
is only one hidden characteristic. Therefore a further important step 
will be to develop models with multiple characteristics. Some preliminary 
work by Engers (1984) suggests that parallel results are possible 
with equal numbers of characteristics and signals. However, this area 
remains largely unexplored. 

Another theoretical simplification made in this paper is the assumption 
that the opportunity cost of choosing to signal at all is the 
same for each type. Especially in the labor market case it seems much 
more plausible that those workers with a high productivity will have a 
higher opportunity cost. While introduction of a reservation utility 
$U_A(\theta)$, which varies across types, complicates the technical details, it is 
clear from the recent work by Engers and Fernandez (1984) that the 
conclusions are essentially unchanged. 

A further feature of the model deserving clarification is the ordering 
of the players’ moves. In the analysis here, it is the uninformed 
buyers who must make the first move, each announcing a menu of 
contracts to the sellers. While this is the natural assumption for some 
applications, there are others in which the order is reversed. For 
example, in the signaling of higher earnings via dividend increases 
(Bhattacharya 1979), it is the uninformed outsiders who respond to 
the dividend signals of the informed insiders. As Stiglitz and Weiss 
(198(1) show, the problem in such models is not the lack of a Nash 
equilibrium but the plethora of such equilibria. Very recently, however,
Kreps (1984) has argued that the only “stable” Nash equilibrium
is the Pareto-dominating separating set of contracts examined in this 
paper. 

Finally, it would be incomplete to finish without some comment on
how behavior can be modeled when the underlying assumptions imply
that no Nash equilibrium exists. As indicated in the Introduction,
there have been various attempts to model an equilibrium in which
principals take into account anticipated reactions when considering
alternative actions. To illustrate, consider figure 1 once more. Since
the contract set $\{E_0, E_1, E_2\}$ is the Pareto-efficient separating set, any
new offer, such as D, that generates strictly positive expected profits
must involve pooling. Then there is always a reaction such as T that
skims the cream from the pool. As a result the initial “defection” D
generates losses while the reaction T makes profits on every agent
who accepts it. It thus seems plausible that the principal considering
the defection D will recognize that the reaction T poses a serious
threat. As a result he will be deterred from choosing to offer D. The
Pareto-efficient separating set is then a reactive equilibrium.

The crucial step in this argument is the assumption that at least one
other principal will be able to exploit the opportunity arising from the
announcement of a new offer, such as D, before the new offer has 
generated significant profits. (Alternatively, once offered, D cannot 
be quickly withdrawn, so that the initial profits are offset by later 
losses as other principals respond.) Therefore, in using a non-Nash 
equilibrium concept to model behavior in some specific market it is 
important to begin by considering the reasonableness of the quick 
reaction hypothesis. 


Appendix 


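Lemma 1. Define

$$y(\gamma) = \frac{\int_\beta^\gamma \theta H'(\theta)\,d\theta}{H(\gamma) - H(\beta)},$$

where $H(\cdot)$ is twice continuously differentiable. Then (i) $y(\beta) = \beta$,
(ii) $y'(\beta) = 1/2$, and (iii) $y''(\beta) = H''(\beta)/[6H'(\beta)]$.

Proof. Conclusion i follows from a direct application of L’Hôpital’s rule. To
prove ii define $A(\gamma) = H(\gamma) - H(\beta)$. Then we can rewrite $y(\gamma)$ as

$$y(\gamma) = \frac{\int_\beta^\gamma \theta A'(\theta)\,d\theta}{A(\gamma)}.$$

Integrating the numerator of this expression by parts we obtain

$$y(\gamma) = \gamma - \frac{\int_\beta^\gamma A(\theta)\,d\theta}{A(\gamma)}. \tag{A1}$$

Next differentiating by $\gamma$ we obtain

$$y'(\gamma) = \frac{A'(\gamma)\int_\beta^\gamma A(\theta)\,d\theta}{A(\gamma)^2}. \tag{A2}$$

Applying L’Hôpital’s rule twice and noting that $A(\beta) = 0$ we obtain ii.
Using ii and (A2) we can also write

$$\frac{y'(\gamma) - y'(\beta)}{\gamma - \beta} = \frac{A'(\gamma)\int_\beta^\gamma A(\theta)\,d\theta - \tfrac{1}{2}A(\gamma)^2}{(\gamma - \beta)A(\gamma)^2}. \tag{A3}$$

As $\gamma \to \beta$ the left-hand side approaches $y''(\beta)$. Applying L’Hôpital’s rule three
times, the right-hand side approaches $A''(\beta)/[6A'(\beta)]$. This proves iii, since the
derivatives of $A$ and $H$ are identical. Q.E.D.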
The Effect of Professional Advice on the
Stability of a Speculative Market


Frank T. Denton 

McMaster University 


The paper is motivated by the apparent inconsistency of situations in 
which speculators rely heavily on relatively few advisers when theory 
and evidence indicate that market fluctuations are predominantly 
random and unpredictable. A highly stylized model is used to show 
how such situations can arise if speculators choose advisers on the 
basis of observed track records and how block behavior and increased
market instability can result. It is argued that inferences
based on track records may represent a confusion of luck with skill 
or, more formally, an inappropriate choice of null hypothesis to be 
tested. Other related issues are discussed. 


I. Introduction 

One of the more intriguing aspects of speculative markets is the role 
played by professional advisers and the apparent weight given to the 
views of a relatively small number of "influential,” “highly regarded,” 
or “prestigious” prognosticators. To take one example, the Florida- 
based investment adviser Joseph Granville “made himself famous by 
spurring a 23-point plunge in the Dow Jones average on January 7 
[1981]. Granville’s terse advice, ‘sell everything,’ was delivered 
worldwide by Telex and telephone throughout the evening of the 
preceding Tuesday. What followed the opening gong in New York 


The following were kind enough to comment on an earlier draft of this paper:
Jeffrey Callen, Mel Kliman, Itzhak Krinsky, Leslie Robb, Byron Spencer, Gordon
Tullock, and Doug Welland. I am grateful for their suggestions and for those of George
Stigler and the anonymous referees. However, responsibility for all remaining
deficiencies of the paper rests with me.
was a selling fury that battered the Dow down below 1,000 in the
biggest single day's trading of the exchange’s first 188 years” (“Kaufman
Triggers the Latest Stampede” 1982, p. 36). To take another
example, in August 1982 Henry Kaufman of the New York firm of
Salomon Brothers announced a lowering of his interest rate forecast
and was considered thereby to have initiated a massive wave of buying
and a concomitant sharp rise in share prices throughout the world.
“Almost from the moment Kaufman’s memo hit the wires . . . the
market went wild. . . . On the day following Kaufman’s pronouncement,
nearly 133 million shares changed hands on London’s frantic
stock exchange . . . and the leading-shares index shot up more than 10
points” (“Wall Street’s Wildest Week” 1982, p. 20). “Kaufman has
become almost a public celebrity—overshadowing other would-be 
Wall Street gurus. . . . [His] weekly pronouncements on interest rates
are eagerly anticipated and read by many Wall Streeters as if they
were stone tablets from Mount Sinai” (“Kaufman Triggers the Latest
Stampede” 1982, p. 36). Even allowing for a certain amount of journalistic
hyperbole in these descriptions, it seems clear that the community
of speculative investors does pay much closer attention to
some forecasters and advisers than to others.

Now how can this be? What is the process by which “gurus of the
market place” are identified, and how can their opinions have such
leverage in large-scale speculative markets when theoretical considerations
and the mass of objective statistical evidence show these markets
to be dominated by random and inherently unpredictable fluctuations?
One possibility is that some forecasters do have greater ability
or better information and that market participants recognize this and
pay special heed to their advice. If that is the case, one would expect
the advantages to be small: to be such as to raise the probability of
calling correctly the direction of change of the price of gold or the
Dow Jones industrial index from .5 to .52 or .55, say, but not from .5
to .9. That would still be reason enough to accept the (probabilistically)
superior advice and thereby to increase one’s expected level of
profit. However, there are two difficulties with this model. The first is
that small differences in success probabilities would be hard to detect
in the kind of uncontrolled experimental situation characteristic of
real markets. The second, and more fundamental one, is that, if the
market is a zero-sum game and all participants are equally rational
and well informed,¹ expected profit is necessarily zero for everyone,
and hence the hypothesis of superior advice is untenable: some cannot
profit from advice at the expense of others, for the same advice is
available to all; or, as Tirole (1982, p. 1179) puts it, “not everyone
can possess ‘better than average’ information.”

¹ If the market game has a nonzero sum, it can be reduced to a zero-sum game simply
by considering differences between individual and market average profits. The criterion
for evaluating advice is then whether taking the advice yields a profit expectation
greater than the average for the market as a whole.

Another possibility, the one explored in this paper, is that market
participants choose their advisers on the basis of an incorrect
assessment of “track records.” In commonsense terms, they make the
mistake of confusing luck with skill. More formally, their error
amounts to an inappropriate choice of null hypothesis in a test of
whether forecasting success could have occurred just by chance. The
paper argues that this would be a quite natural error to make and that
its effect would be to increase the instability of a large-scale speculative
market, perhaps sharply, by increasing the tendency toward block
behavior in buying and selling. In support of this argument the paper
makes use of a stylized model of a market game and an associated
process of adviser selection. The model is developed informally in
Sections II and III and formally in Section IV. The question of rationality
in adviser selection is taken up in Section V. Further observations
on differences between the model and real-world markets are
offered in Section VI and a final summary in Section VII.


II. The Buy-and-Sell Game 

The development of the model is facilitated by a visit to Birdland, a 
remote island in the middle of nowhere. The inhabitants of Birdland 
are of two kinds, Sparrows and Owls. The Sparrows are by far the 
more numerous: for all practical purposes their number (M) may be 
regarded as infinite, whereas the number of Owls (N) must be regarded
as finite.

A game known as Buy-and-Sell was introduced into Birdland many
years ago and became very popular within the Sparrow community.
In its original form the game was played with cards and in small
groups. At the beginning of a round, each player would put one bill
(the unit of currency in Birdland) into a cardboard box known as the
“market,” in return for which he would be deemed to have purchased
one market share. (Share certificates were not actually issued as this
was unnecessary.) The player would then be given two cards, one with
the word “buy” printed on it, the other with the word “sell.” He would
choose one of these cards and place it face down on the table in front
of him. When all players had done this, the cards would be turned
over and the numbers of buyers and sellers counted. The market
manager would then declare the price ($p$) of a market share in that
round as equal to the ratio of buyers to sellers: $p = \pi/(1 - \pi)$, where $\pi$
is the proportion of buyers. Each seller would receive the amount $p$
and each buyer the amount $1/p$ from the market. Thus when there
were more buyers than sellers, the sellers would be the winners: each
seller would receive $p > 1$ for his initial payment of one bill, while
each buyer would get back only $1/p < 1$. Conversely, when there were
more sellers than buyers, the buyers would be the winners, each buyer
receiving $1/p > 1$ and each seller only $p < 1$.

A minor problem arose if, when the cards were turned over, it was
found either that all players were buyers or that all were sellers. To
avoid this the game was modified slightly. One of the cards in the buy
deck had the words “designated buyer” on it, and one of the cards in
the sell deck had the words “designated seller.” The two (different)
players who were dealt these cards were required to declare them
immediately and to place them face up on the table. This ensured that
there would always be at least one buyer and one seller, that a price
would always exist, and that the market would always clear at the end
of a round. The game then had a zero-sum outcome in all circumstances:
if there were $m$ players, sellers would receive gains (losses)
totaling $m(1 - \pi)(p - 1)$, and buyers would receive losses (gains)
totaling $m\pi[(1/p) - 1] = -m(1 - \pi)(p - 1)$.²
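
The settlement rule of a single round can be checked numerically. The following is a minimal sketch in Python, not part of the original paper; the function name `settle_round` and its return layout are illustrative assumptions. It uses exact rational arithmetic to confirm that the sellers' and buyers' net gains cancel whenever there is at least one buyer and one seller.

```python
from fractions import Fraction


def settle_round(buyers: int, sellers: int):
    """Settle one round of Buy-and-Sell (hypothetical helper).

    Returns (price, gain per seller, gain per buyer, net payout of the market),
    where each gain is measured net of the one-bill stake.
    """
    if buyers < 1 or sellers < 1:
        # The designated buyer and seller guarantee this never happens.
        raise ValueError("need at least one buyer and one seller")
    m = buyers + sellers
    pi = Fraction(buyers, m)           # proportion of buyers
    price = pi / (1 - pi)              # ratio of buyers to sellers
    seller_gain = price - 1            # each seller receives p for a stake of 1
    buyer_gain = 1 / price - 1         # each buyer receives 1/p for a stake of 1
    net = sellers * seller_gain + buyers * buyer_gain
    return price, seller_gain, buyer_gain, net
```

With three buyers and one seller, for instance, the price is 3, the seller gains two bills, and the three buyers lose two-thirds of a bill each, so the market pays out exactly what it took in.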

The aim of a player in the Buy-and-Sell game was obviously to
outguess the other players: to be a buyer when sellers predominated
and a seller when buyers predominated. In that regard the game was
similar to many other such games played throughout the world.
Buy-and-Sell was played one round at a time rather than continuously, as
in a securities market. However, that is not a fundamental difference,
for one can think of the sequence of rounds as an approximation to a
continuous market process, with the price conveniently normalized to
unity after each round. What establishes kinship between Buy-and-Sell
and a real-world speculative market is that in both cases each
player bases his action on what he thinks the other players will do.

Buy-and-Sell became so popular that the card game version of it 
was replaced by an electronic form that permitted any number of 
Sparrows to participate at the same time. Each player now had a 
remote terminal with a keyboard, and the market was a central computer
instead of a cardboard box. A player would key in his identification
number and then press either the buy key or the sell key on his
terminal. The numbers of buyers and sellers were totaled, the price 
computed, and the gains and losses recorded. Each player had an 
account registered with the market, and his account was automatically 
debited or credited with the appropriate amount at the end of a 
round. As before, a designated buyer and a designated seller were 
chosen at random so that a price would always be established. One 
round of the game was played in each time period. 


² A nondegenerate version of the game requires $p = \pi/(1 - \pi)$ since the market-clearing
equation $m\pi[(1/p) - 1] + m(1 - \pi)(p - 1) = 0$ has roots $\pi/(1 - \pi)$ and unity.
A strange thing happened when the game was first played in the
new electronic form. The number of players was very large (virtually
infinite, in fact, since all the M Sparrows played). Each Sparrow made
his buy-sell choice independently, and when the results of the first
round were announced, it was found that there were no winners and
no losers: half the players were buyers and half were sellers, the price
was unity, and gains and losses were zero. This was the outcome of the
second round as well, and of the third and the fourth. Being unfamiliar
with the law of large numbers, the Sparrows were bewildered and
disappointed. They had expected greater excitement and the possibility
of large profits with so many players now in the game. Instead,
they found that the price was perfectly stable and that there were no
profits at all. It was at this point that the Sparrows turned to the Owls
for advice.

III. Professional Counseling as a Source of 
Instability 

The Owls were wise-looking birds. They were known to have studied
at various institutes and centers of learning throughout the world,
and their bookshelves were lined with impressive-looking volumes on
econometric techniques, time-series models, the theory of technical
analysis, and other such subjects. (Some also had books on astrology
and tea cup reading, but these they kept out of sight.) The Owls were
delighted when the Sparrows came to them for advice, for they saw an
opportunity to put their knowledge and training to use. They would
have to charge a modest fee, of course, but that did not deter the
Sparrows. There were huge numbers of bills involved in the Buy-and-Sell
game now, and a Sparrow who could outguess the other players
could make a handsome profit. A modest fee was a small price to pay
for good advice.³

There were $n_1$ Owl advisers in the first period in which their advice
was sought, and since all the Owls participated, $n_1$ was equal to $N$. As
the Owls were equally wise looking, there was no basis for choosing
one over another. The Sparrows therefore divided themselves evenly
among their new advisers, so that each Owl had $(M - 2)/n_1$ clients
($M - 2$ rather than $M$ because of the designated buyer and seller).

As the day of the first test approached, the Owls came to realize that
they had not a clue as to how the game would turn out; in fact,
they were no better able to forecast the outcome than were the Sparrows.
However, they were not about to admit that (especially since
their fees had been paid in advance). Faced with the necessity of
making a prediction, each Owl retired quietly to his study and tossed a
coin. If heads came up he told his clients to buy; if tails, he told them
to sell. The Sparrows eagerly awaited the word from their advisers,
and when it came they accepted it without question: if an Owl said
“buy,” all his clients bought; if he said “sell,” all of them sold. Thus
instead of making their transaction decisions independently, as they
had before, the Sparrows now acted in blocks, so that effectively there
were only $n_1$ transactors (aside from the designated buyer and seller)
rather than $M - 2$.

³ The introduction of fees meant that the game now had a negative-sum rather than
a zero-sum outcome. However, the Sparrows were oblivious to this; they had their eyes
firmly fixed on the opportunities for individual profits.

In the first period in which the game was played with advice from
the Owls the price did deviate from unity, not drastically, but still
enough to yield the winners a little profit. The winners were pleased,
of course, and congratulated themselves on their choice of Owls. The
losers, on the other hand, were anything but pleased: not only had
they lost money on the basis of bad advice, they had paid good money
for that advice. They immediately switched their allegiance to those
Owls who had predicted correctly in the first period. Thus in the
second period there were $n_2 < n_1$ adviser Owls, each with $(M - 2)/n_2
> (M - 2)/n_1$ clients. The other $N - n_2$ Owls continued to make
predictions, but no one paid any attention to them. They had been
discredited.

The same thing happened in the third period: $n_3 < n_2$ of the Owls
had predicted correctly in the second period (as well as the first), and
the disgruntled loser Sparrows immediately switched their allegiance,
leaving $N - n_3$ Owls without clients. And so it went in the subsequent
periods. The number of Owls whose advice was heeded grew smaller
and smaller.⁴ Concomitantly, the price tended to fluctuate more and
more widely. The market was clearly becoming more unstable as time
went on.
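
The shrinking set of heeded advisers can be illustrated with a small simulation. The sketch below is not from the paper: the function name `simulate_wise_owls`, the seed handling, and the treatment of an exact tie as "everyone keeps his clients" are assumptions made for illustration. Each heeded Owl tosses a fair coin, the minority side wins, and only the Owls on the winning side keep their clients.

```python
import random


def simulate_wise_owls(num_owls: int, max_periods: int = 50, seed: int = 0):
    """Track how many Owls are still heeded after each period (hypothetical helper)."""
    rng = random.Random(seed)          # seeded so a run can be repeated
    wise = num_owls
    history = [wise]
    for _ in range(max_periods):
        if wise == 0:
            break                      # no heeded adviser is left
        buy_calls = sum(rng.random() < 0.5 for _ in range(wise))
        sell_calls = wise - buy_calls
        if buy_calls == sell_calls:
            survivors = wise           # exact tie: assumed that nobody is discredited
        else:
            # The side with fewer Owls (and hence fewer clients) wins.
            survivors = min(buy_calls, sell_calls)
        wise = survivors
        history.append(wise)
    return history
```

Calling `simulate_wise_owls(100)` returns a non-increasing sequence that starts at 100. A lone heeded Owl is always in the majority, so a run that reaches one adviser drops to zero in the next period.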

Those Owls who had predicted correctly in every period came to be
known as “Wise Owls,” whereas those who had failed to achieve this
distinction were known simply as “Ordinary Owls.” One of the Wise
Owls after five periods was Harry, and one of Harry’s clients was a
Sparrow named Fred.⁵ Now Fred had taken a course in probability
and statistics in his younger days, and he remembered that even rare
events happen with nonzero probability and that one can carry out a
formal test of the hypothesis that an occurrence was due simply to
chance. Impressive as Harry’s record seemed, could it be merely the
result of a string of good luck? Fred reasoned that if Harry had no
special forecasting skill his probability of success in any one period
would be 1/2 and his probability of five successes in a row would be
$(1/2)^5 = 1/32$. In other words, the odds would be more than 30:1 against such
an achievement. All lingering doubt was dispelled: Harry must indeed
be a Wise Owl! (Fred had made a mistake in assuming the
probability of a success to be 1/2. However, had he done the calculation
correctly he would have found the probability of five consecutive
successes to be even smaller, so that his conclusion would have been
even more strongly supported. Fred’s more serious mistake was in
failing to realize that he was testing an inappropriate null hypothesis,
as we shall see later.)

⁴ One might have thought that a discredited Owl would have been allowed to rebuild
his reputation if his subsequent forecasts proved accurate and, consequently, that he
could at least have hoped to recover his clientele as time passed. However, in this period
of Birdland history the Sparrows were unforgiving: once discredited, always discredited.

⁵ Disclaimer: All characters in this story are fictional. Any resemblance to real birds,
living or dead, is purely coincidental.

As the set of Wise Owls grew smaller and smaller, the remaining
members looked even wiser and more solemn than before. They did
complicated time-series analyses and constructed complex computer
models. (Some of the discredited Owls were heard to refer to these
models as “random number generators,” but most of the Sparrows
did not understand the term, and those who did put it down to sour
grapes.) The Wise Owls explained to the Sparrows that accurate forecasting
was a costly business, pointed to their amazing records of
success, and raised their fees. The Birdland financial press published
interviews with them and gave front-page coverage to their latest
pronouncements. Their names were household words. In view of
their obviously superior forecasting abilities the increase in fees was
accepted by the Sparrows without complaint.

It came to pass after some further number of periods that there
were only three Wise Owls left. In the very next period two of these
said “buy” and one said “sell.” Those Sparrows who were sellers won,
of course, and now there was only one Wise Owl, an impressive-looking
bird by the name of George. What an astute forecaster
George must be! All the Sparrows flocked to him and waited eagerly
for his advice. “Sell,” said George, and sell they all did. The price
plummeted to $1/(M - 1)$, the designated buyer became an instant
millionaire, all the other Sparrows lost, and George went into the
Birdland history books as the author of the Great Market Crash.⁶

When the game did begin again it was on a new basis. With the
wisdom of hindsight the Sparrows could see that they had been too
demanding: infallibility was just too severe a requirement. “To err is
birdlike,” they said. “From now on we shall not ask for a perfect
record, but merely a good one.” Moreover, they began to think that
old achievements might no longer be relevant and that perhaps they
should look back only a limited time in assessing the forecasting records
of their advisers. The Owls reinforced this view. They talked
about structural changes, recent advances in forecasting techniques,
more sophisticated computer models, and other such things. The
Sparrows did not know what all that meant, but they did get the
message that old data should not count anymore. Henceforth they
would look back only a certain number of periods and choose as
advisers those Owls who had predicted successfully in a specified
(high) proportion of those periods.

⁶ Fortunately for George, he too had become wealthy, having garnered all the adviser
fees in the last round. He was therefore able to bear his disgrace with some degree of
equanimity.

IV. A Formal Model 

One of the Owls who had been discredited in an early period had
retired to the local university, where he gave lectures on forecasting
and speculation for profit. Between lectures he wrote learned papers.
One of his papers was entitled “The Theory of the Buy-and-Sell
Game.”

Professor Owl began by noting that if all players except the designated
buyer and the designated seller were to make their buy-sell
decisions independently, without benefit of advice from the Owls, the
proportion of buyers, $\pi$, would be determined on the sample space
$S = \{1/M, 2/M, \ldots, (M - 1)/M\}$ and would have the binomial distribution
for $M - 2$ independent trials with a probability of success in any
one trial of .5. The mean and variance of $\pi$ would be given by
$E(\pi) = .5$ and $\mathrm{var}(\pi) = .25(M - 2)/M^2$. Since $M$ was very large,
effectively the variance would be zero. However, as soon as the Owls came into the
picture, the Sparrows would make their decisions in blocks (except for
the designated buyer and seller). Since the Sparrows would distribute
themselves uniformly among their advisers, the sample space for $\pi$
would be $\Omega = \{1/M, (c + 1)/M, (2c + 1)/M, \ldots, (nc + 1)/M\}$, where $n$
is the number of Wise Owls and $c = (M - 2)/n$, the number of clients
per Wise Owl. If the Wise Owls made their decisions independently,
the probability function would be binomial with $n$ independent trials.
The expected value of $\pi$ would still be .5, but the variance would now
be given by $\mathrm{var}(\pi) = (.25/n)[(M - 2)/M]^2 > 0$. As $n$ declined, the
variance would increase.⁷


⁷ The variance would be further increased if the Wise Owls made their decisions
cooperatively or if they used similar information (e.g., the positions of the stars) or
similar forecasting techniques (e.g., the same kinds of econometric models). Professor
Owl assumed this not to be the case.
Next he worked out the probability of successful prediction in any
one period. For an Ordinary Owl (whose advice would be disregarded),
the prediction and the outcome would be independent, and
the probability of success, $P_1$, was obviously .5. For a Wise Owl,
though, the prediction would influence the outcome, and the probability
of success would depend on the total number of Wise Owls. To
be successful a forecaster must avoid being in the majority group. A
success would occur if the individual forecaster said buy (sell), if $r - 1$
of the other forecasters also said buy (sell), and if the remaining $n - r$
said sell (buy), with $r \le n/2$. The probability of success for an individual
Wise Owl was therefore

$$P_2 = 2(.5)\sum_{r=1}^{\delta(n)}\binom{n-1}{r-1}(.5)^{r-1}(.5)^{n-r} = (.5)^{n-1}\sum_{r=1}^{\delta(n)}\binom{n-1}{r-1} \le .5, \tag{1}$$

where $\delta(n) = n/2$ for $n$ even and $\delta(n) = (n - 1)/2$ for $n$ odd.⁸
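
Equation (1) is easy to evaluate exactly for small numbers of Wise Owls. The following sketch is an illustration, not the author's code; `delta` and `success_prob_wise` are hypothetical names. It sums the binomial coefficients for the cases in which the forecaster is not in the majority and divides by the number of equally likely outcomes for the other Owls.

```python
from fractions import Fraction
from math import comb


def delta(n: int) -> int:
    """Upper summation limit: n/2 for n even, (n - 1)/2 for n odd."""
    return n // 2


def success_prob_wise(n: int) -> Fraction:
    """One-period success probability of a Wise Owl when n Wise Owls are heeded."""
    # r counts the Owls on the forecaster's side, including the forecaster.
    total = sum(comb(n - 1, r - 1) for r in range(1, delta(n) + 1))
    return Fraction(total, 2 ** (n - 1))
```

For n = 2 and n = 4 the function returns exactly 1/2, for n = 3 it returns 1/4, and for n = 1 it returns 0, since a single heeded adviser is necessarily in the majority.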

Professor Owl assumed that the total population of Owls was $N$,
that each player in the Buy-and-Sell game had a memory $T$ periods
long, and that each player had a success criterion $\alpha$, where $\alpha$ is defined
as the proportion of the $T$ periods in which a forecaster was
successful. Letting $n_t$ be the number of Wise Owls in period $t$, the
probability function for $n_t$ would depend on $N$, $T$, and $\alpha$, but also on
$\{n_{t-k}, k = 1, 2, \ldots\}$, since the probabilities of success for individual
Owls would have changed over time as they moved back and forth
between the Wise Owl and the Ordinary Owl categories and as the
number of Wise Owls fluctuated from period to period. However, he
noted (1) that $P_1$ would never be less than $P_2$ and would be equal to it
for $n$ even; (2) that, if $N$ were relatively large and $n$ relatively small,
the largest proportion of the Owl population would be Ordinary Owls
in any given period; and (3) that, as $n$ increased, $P_2$ would approach
$P_1$ for $n$ odd. He therefore used $P = P_1 = .5$ as an approximation to
the one-period success probability for every individual Owl. By overstating
slightly the average probability of success he would be overstating
the probabilities of larger values of $n$ and understating slightly
the variance of $\pi$. He was content to err a little on the conservative
side in calculating the variance since, if anything, that would
strengthen his conclusions.

⁸ The case in which $n$ is even and exactly half the Wise Owls said buy and half said sell
($r = n/2$) was treated as a success (or nonfailure), the reasoning being that a forecaster’s
clients would have no incentive to change allegiance if no other forecaster had outperformed
him. The equality would then hold in eq. (1) for $n$ even and the inequality for $n$
odd.

On the foregoing basis he determined the probability that an individual
Owl would be a Wise Owl (by virtue of having been successful
in no fewer than $\alpha T$ out of $T$ periods, with $.5 < \alpha \le 1$). He found this
to be

$$P(\alpha|T) = \sum_{k=\delta(T)}^{T}\binom{T}{k}P^k(1 - P)^{T-k} = (.5)^T\sum_{k=\delta(T)}^{T}\binom{T}{k}, \tag{2}$$

where $\delta(T)$ denotes the smallest integer such that $\delta(T) \ge \alpha T$. Defining
$\mathrm{pr}(n|N, T, \alpha)$ as the probability of there being $n$ Wise Owls (and hence
$N - n$ Ordinary Owls) in any given period, conditional on $N$, $T$, and
$\alpha$, he then wrote


$$\mathrm{pr}(n|N, T, \alpha) = \binom{N}{n}[P(\alpha|T)]^n[1 - P(\alpha|T)]^{N-n}. \tag{3}$$

Thus $n$ was seen to be generated by the Bernoulli process $B[N, P(\alpha|T)]$, with

$$E(n) = NP(\alpha|T), \tag{4}$$

$$\mathrm{var}(n) = NP(\alpha|T)[1 - P(\alpha|T)]. \tag{5}$$

In the special case $\alpha = 1$, which implies the infallibility criterion
used by the Sparrows before the Great Market Crash, the probabilities
determined in equations (2) and (3) reduce to

$$P(1|T) = (.5)^T, \tag{6}$$

$$\mathrm{pr}(n|N, T, 1) = \binom{N}{n}(.5)^{nT}[1 - (.5)^T]^{N-n}. \tag{7}$$
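
The infallibility case lends itself to a short numerical illustration. The sketch below is not from the paper; the function names are hypothetical, and it adopts the approximation that every Owl succeeds with probability .5 in each period. It evaluates the binomial probability of n Owls having a perfect record over T periods and, from the n = 0 term, the chance that at least one Owl in a population of N looks infallible.

```python
from fractions import Fraction
from math import comb


def prob_wise_infallible(T: int) -> Fraction:
    """Probability that one Owl is right in all T periods: (.5)^T."""
    return Fraction(1, 2 ** T)


def pmf_wise_count(n: int, N: int, T: int) -> Fraction:
    """Probability that exactly n of N Owls have a perfect T-period record."""
    p = prob_wise_infallible(T)
    return comb(N, n) * p ** n * (1 - p) ** (N - n)


def prob_at_least_one(N: int, T: int) -> Fraction:
    """Probability that some Owl among N has a perfect T-period record."""
    return 1 - pmf_wise_count(0, N, T)
```

With T = 5 a given Owl has a perfect record with probability 1/32, yet `prob_at_least_one(100, 5)` exceeds .95: among 100 coin-tossing Owls, the existence of at least one five-period "Wise Owl" is close to certain under these assumptions.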


By way of illustrating the implications of all this, Professor Owl made
up a little table (table 1). He selected different sets of values for $N$, $T$,
and $\alpha$, and for each set he calculated $P(\alpha|T)$, the probability that an
Owl would be a Wise Owl, $E(n)$, the expected number of Wise Owls,
and $S[\pi|E(n)] = \{\mathrm{var}[\pi|E(n)]\}^{1/2}$, the standard deviation of $\pi$ evaluated
at $n = E(n)$. As alternative sizes of the Owl population he chose
$N = 100$ and $N = 1{,}000$. As alternative memory lengths for the
Sparrows he chose $T = 0$ (no memory), $T = 1$, $T = 2$, $T = 5$, and $T = 10$.
As alternative success criteria he chose $\alpha = 1$ (infallibility), $\alpha = .8$,
and $\alpha = .6$ (a relatively weak requirement since presumably $\alpha$ would
never be as low as .5).
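
The three quantities described above can be computed for any chosen population size, memory length, and success criterion. The sketch below is a hedged illustration rather than a reproduction of table 1: the function names are hypothetical, the success criterion is passed as an exact fraction to avoid rounding at the threshold, and the standard deviation is evaluated in the limit of a very large number of Sparrows, where the block-play variance .25/n applies with n set to the expected number of Wise Owls.

```python
from fractions import Fraction
from math import ceil, comb, sqrt


def prob_wise(alpha: Fraction, T: int) -> Fraction:
    """Probability of at least alpha*T successes in T fair coin tosses."""
    threshold = ceil(alpha * T)        # smallest integer not below alpha*T
    favorable = sum(comb(T, k) for k in range(threshold, T + 1))
    return Fraction(favorable, 2 ** T)


def table_row(N: int, T: int, alpha: Fraction):
    """Return (P, expected number of Wise Owls, std. dev. of buyer proportion)."""
    p = prob_wise(alpha, T)
    expected_n = N * p
    # Assumes M is effectively infinite, so var = .25 / n evaluated at n = E(n).
    sd_pi = 0.5 / sqrt(expected_n)
    return p, expected_n, sd_pi
```

For example, `table_row(100, 2, Fraction(3, 5))` gives a probability of 1/4, an expected 25 Wise Owls, and a standard deviation of 0.1 under the large-M assumption.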

Professor Owl used his table to illustrate that the standard deviation
of $\pi$ is a decreasing function of $N$ and a nondecreasing function of $\alpha$.
He noted that, for some values of $T$, $S[\pi|E(n)]$ could remain the same
when $\alpha$ changed because $P(\alpha|T)$ would not change, so that introducing
a more severe success criterion would have no effect. For example,
with $T = 2$ an Owl would have had to have predicted correctly
in each of the previous two periods in order to satisfy any success
criterion such that $.5 < \alpha \le 1$. He noted also that, in general,
$S[\pi|E(n)]$ increases with $T$ but that reversals are possible. For example,
with $N = 100$ and $\alpha = .6$, $P(\alpha|T)$ is .5 for $T = 1$, falls to .25 for
$T = 2$, but then rises again to .5 for $T = 5$; as a consequence,
$S[\pi|E(n)]$, which varies inversely with $E(n)$, rises and then falls. Aside
from this, he found the
tendency for the variability of $\pi$ to increase with $T$ rather interesting.
He was accustomed to thinking of longer memory processes as imparting
stability, but here was a case in which just the opposite was
true: the longer the memory of the Sparrows when they evaluated the
Owls' track records, the more unstable was the market game.⁹

V. Are Speculators Rational When They Choose 
Advisers? 

The foregoing is of obvious interest to ornithologists. However, economists
may find it interesting also, inasmuch as the game of Buy-and-Sell
has some basic features of a purely speculative market, and the
identification of a subset of forecasters or advisers whose views are
given special weight seems to accord with observed practice in real
markets. A question of particular interest is whether such practice is
“rational.”

9 Professor Owl was careful to emphasize to his readers that the calculations in his
table were purely illustrative. He pointed out that (for specified N and $\alpha$) there was a
nonzero probability that n would be zero in any given period, in which case the chosen
$\alpha$ criterion would have to be abandoned in favor of some other criterion. He pointed
out too that the conditional standard deviation, $S[\pi \mid E(n)]$, would in general be less than
the unconditional one, $S(\pi)$, since the conditional one ignored the fact that n was a
random variable. However, the calculation of $S(\pi)$ was sufficiently complicated that he
did not attempt it. (Again he was satisfied to err on the conservative side in his
calculations.)

Advisers have no special forecasting abilities in the model, and all
are exactly alike. A speculator who knew that every adviser had the
same probability of success would have no basis for anything but a
random choice of one adviser over another. Even if he did not know
what the success probability was, he would still know enough to pool
the observations on all advisers in order to estimate it. But suppose
that he did not realize that they all had the same probability of success.
He would then be behaving quite rationally (conditional on what
he did know) if he estimated the probability for each adviser by using
only the adviser’s own track record. Let $\theta_i$ be the probability of success
for the ith adviser and let $\theta_i(T)$ be the observed proportion of successes
for that adviser over a memory period T. The maximum-likelihood
estimator of $\theta_i$ is $(1/n)\sum_i \theta_i(T)$, the average success proportion
over all advisers, under the condition $\theta_i = \theta$ (i = 1, 2, ..., n), but
it is simply $\theta_i(T)$ in the absence of that condition. If changing advisers
were costless, it would be entirely rational for an individual speculator
to choose an adviser for whom the maximum-likelihood estimate was
high, changing his allegiance in each period if that were necessary. (If
there were costs involved in making a change or if the more successful
advisers charged higher fees, a distinct possibility, this would have
to be taken into account.) 10
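
The contrast between the two estimators can be made concrete with a small sketch. It assumes each adviser's track record is a list of 1s (successes) and 0s (failures) over the same memory period T; the function names and the sample records are hypothetical.

```python
def individual_estimates(records):
    """Per-adviser estimate: each adviser's own observed success proportion."""
    return [sum(r) / len(r) for r in records]

def pooled_estimate(records):
    """Estimate under a common success probability: the average of the
    individual success proportions over all advisers."""
    props = individual_estimates(records)
    return sum(props) / len(props)

if __name__ == "__main__":
    # Three hypothetical advisers observed over T = 4 periods.
    records = [[1, 1, 1, 1], [0, 1, 0, 1], [0, 0, 1, 0]]
    print(individual_estimates(records))
    print(pooled_estimate(records))
```

A speculator who does not know that the advisers are identical would rank them by the first set of numbers and pick the first adviser, even though all three records were generated by the same coin.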

Another way of putting the matter is to say that speculators confuse 
luck with skill when they evaluate forecasting track records. When 
Fred did his calculation to see whether Harry’s record of five suc¬ 
cesses in a row could be the result merely of chance, he was making an 
inappropriate test. The fact that he was aware of Harry's success 
record was itself a result of the underlying random process that se¬ 
lected Harry as a lucky winner. Fred should have calculated the prob¬ 
ability that any of the Owls would have predicted successfully in each 
of five consecutive tries. If there were 100 Owls, he would have found 
the probability to be about .96 that at least one Owl would have 
achieved the five successes. Far from being remarkable, a record such 
as Harry’s was therefore almost a certainty in the circumstances and 
gave no reason at all to infer superior forecasting ability. 11 
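
The figure of about .96 can be reproduced directly. The sketch below assumes that the Owls' forecasts are independent and that each forecast is correct with probability one-half; the function name is illustrative.

```python
def prob_at_least_one_streak(n_owls: int = 100, streak: int = 5, p: float = 0.5) -> float:
    """Probability that at least one of n_owls independent forecasters
    is right in every one of `streak` consecutive periods."""
    single = p ** streak                 # one forecaster gets the full streak
    none = (1.0 - single) ** n_owls      # nobody gets the full streak
    return 1.0 - none

if __name__ == "__main__":
    print(round(prob_at_least_one_streak(), 3))
```

For a single named Owl the same streak has probability 1/32, which is the calculation Fred actually made; the difference between the two numbers is the selection effect.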

VI. Further Observations 

All models ignore some aspects of reality in order to focus on others, 
and the model used in this paper is no exception. A number of ways
in which it differs from real-world markets may be noted briefly, 
further discussion being provided in Denton (1984). 

The model ignores the transactions in commodities or asset own¬ 
ership that underlie real markets in order to concentrate attention on 
purely speculative behavior and the role of professional advisers in 


10 A quite different line of argument is provided by the psychological theory of
cognitive dissonance: a speculator might prefer to believe and choose to believe in the
superiority of his adviser because this would allow him to hold greater expectations of
profit. See Akerlof and Dickens (1982) on applications of the theory to economics in
general and Maital (1982, p. 227) on its application to stock market speculation. See also
Denton (1984) for further discussion in the present context.

11 There is an interesting and close analogy between bias in adviser selection and bias
in the selection for publication of scientific results based on statistical significance tests.
Some advisers may be chosen over others because of previous forecasting success
resulting merely from good luck; a scientific paper reporting rejection of an interesting
null hypothesis may have a higher probability of acceptance for publication than a
paper reporting only “negative” results, even though the rejected hypothesis may in
fact be true and its rejection the result merely of chance. On the problem of publication
bias and its implications, see Sterling (1959), Tullock (1959), and Denton (1985). See
Denton (1984) for further discussion of the analogy.
influencing such behavior. It ignores too the presence of nonspeculators
who participate in speculative markets for the purpose of
hedging. Abstraction from real transactions and the hedging motive 
is a common device in the literature of the economics of speculation. 
(For example, Figlewski [1982, p. 90] adopts a model “in which specu¬ 
lation on price changes is the only motive for trading, and the return 
for investors as a group is zero.”) 

The model ignores differences in information and the skill with
which information is processed. The evidence on the actual skills and 
performance of investors is often somewhat impressionistic. Malkiel 
(1982, p. 82) suggests that the increased role of professional investors 
has not reduced the tendency for stock prices to fluctuate. Evidence 
relating to futures markets provided by Stewart (1949) and Rockwell 
(1977), and summarized by Stein (1983), suggests that professional 
speculators do have better-than-average forecasting abilities. On the 
whole, there seems to be considerable uncertainty as to the extent of 
skill differences. 

All investors are assumed to invest the same amount in any given 
period in the model, and this is clearly unrealistic. However, allow¬ 
ance for large buyers and sellers would simply increase the tendency 
toward block behavior and, hence, the degree of market instability. 

Another element of unreality is the assumption that advisers do not 
themselves speculate. Again, though, that is not a harmful assump¬ 
tion. If an agency makes a forecast and takes its own advice, it can be 
viewed as if it were both an adviser and a client. If the agency is a large
investor, and hence a large client of itself, the effect will be the same 
as if it were simply an adviser to a large number of small clients. If the 
agency’s forecasts are wrong, it may fire its chief forecaster and turn 
to someone with a better track record. 

The model's assumption that all advice in a speculative market is no 
better than the toss of a coin might be viewed as extreme. However, 
given the demonstrably very large random element in speculative 
markets, it is clear that correct forecasting in any given period must be 
attributable to luck, at least in large measure if not in total. 

An alternative model that allowed for information or skill differ¬ 
ences among advisers might be considered. One question is to what 
extent such differences could be discerned by potential clients, given 
the noisy nature of the markets. (Perhaps one should assume that 
some clients have better information than others about the informa¬ 
tion the advisers have—a second-order information advantage, so to 
speak. Advantages of higher order are also possible.) Other questions 
have to do with the consequences of such discernment. Suppose, for 
example, that the set of advisers consists of two subsets, A and B, and 
that the members of A have information not available to the members 
of B. Suddenly all clients come to realize this and start to take advice
only from the A advisers. What are the results? First, the effective 
number of advisers is reduced. The degree of block behavior is there¬ 
fore increased and, hence, the tendency toward price variability. Sec¬ 
ond, the A advisers (and their clients) have lost the information ad¬ 
vantage they previously had, for (in effect) only A advisers remain. 
All remaining advisers are thus now on the same footing, and the 
coin-tossing assumption is again valid. The very success of the A 
advisers has caused their downfall. These are interesting questions, 
but further consideration of them would take us beyond the aims of 
the present paper. 

The model assumes, for simplicity, that each speculator chooses a 
single adviser and takes his advice without question. In practice, 
speculators may well consider the recommendations of several market 
analysts before reaching a decision, and at first glance one might 
think that this would impart greater stability by reducing the weight 
given to any single forecast. However, a moment’s reflection suggests 
that just the opposite may be true. Suppose, for example, that each 
speculator in a given market based his decision on the average advice 
of all forecasters. If 60 percent of the forecasters said “buy” and 40 
percent said "sell,” then he would sell (and perhaps congratulate him¬ 
self on taking such a careful and reasonable approach). But every 
other speculator would be trying to sell also, and the price would 
collapse. In the context of the model, over any sequence of periods 
the price would oscillate wildly between extremes, and the market 
would have the maximum degree of instability because all speculators
would in effect (and of course unwittingly) be acting as a single block. 
Moreover, the choice of a weighting function to apply to the individ¬ 
ual forecasters is irrelevant: any arbitrary function will produce the 
same result as long as all speculators use it (e.g., a function in which 
forecasters are weighted in accordance with their track records). 
More realistically, if the functions differ among speculators but are 
closely related, there will be a strong though less than perfect correla¬ 
tion among market decisions and, hence, a high though less than 
extreme degree of price instability. Only if the weighting functions 
are uncorrelated will the use of common information (forecasts) im¬ 
part no tendency toward block behavior. 
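
The point about common weighting functions can be illustrated with a toy calculation. The sketch assumes forecasts are coded +1 for “buy” and -1 for “sell,” that each speculator buys when his weighted sum of forecasts is positive and sells otherwise, and that net demand is the sum of these decisions; all names and numbers are hypothetical.

```python
def decision(forecasts, weights):
    """One speculator's action: +1 (buy) if the weighted advice is positive, else -1 (sell)."""
    score = sum(w * f for w, f in zip(weights, forecasts))
    return 1 if score > 0 else -1

def net_demand(forecasts, weight_sets):
    """Sum of all speculators' actions; +/- len(weight_sets) means a single block."""
    return sum(decision(forecasts, w) for w in weight_sets)

if __name__ == "__main__":
    forecasts = [1, -1, -1, 1, -1]          # five hypothetical forecasters
    equal = [1, 1, 1, 1, 1]                 # simple average of all advice
    by_record = [4, 1, 1, 4, 1]             # weights based on track records
    print(net_demand(forecasts, [equal] * 100))                        # everyone uses one rule
    print(net_demand(forecasts, [by_record] * 100))                    # a different common rule
    print(net_demand(forecasts, [equal] * 50 + [by_record] * 50))      # two unrelated rules
```

Whichever rule is shared, the 100 speculators end up on the same side of the market; only when the rules differ enough to produce different decisions do the orders offset one another.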

VII. Summary 

The motivation for this paper was the apparently anomalous charac¬ 
ter of situations in which participants in speculative markets place 
heavy reliance on relatively small numbers of professional forecasters 
(advisers) when theoretical considerations and the mass of objective 
statistical evidence indicate that such markets are dominated by random
fluctuations. A highly stylized model of a pure speculative market
and an associated process of adviser selection was used to show
how this might come about if speculators choose advisers on the basis 
of observed track records and how that would tend to encourage 
block behavior and increase the instability of the market. The model 
was employed initially to “tell a story” and then developed formally to 
allow specifically for the effects of size of forecaster population, 
length of memory of speculators, and speculators’ criterion as to what 
constitutes an acceptable track record. The instability of the model 
market was seen to be a decreasing function of the forecaster population,
a nondecreasing function of the required track record success
rate, and a generally increasing function of the length of memory 
(although reversals with increasing memory were seen to be possible). 
It was argued that inferences based on individual track records may 
represent a confusion of luck with skill or, in more formal terms, an 
inappropriate choice of null hypothesis to be tested statistically. It was 
argued also that whether market participants are behaving “rationally”
when they base their choices of advisers on individual track
records depends on how much information they are assumed to have 
about the underlying market processes. 

Some features of real-world markets ignored in the model were 
noted. It was argued that taking account of these features would not 
vitiate the basic argument that similarities in the assessment of advice
lead to block behavior and price instability, even though each 
speculator may think he is acting independently. 


References 

Akerlof, George A., and Dickens, William T. “The Economic Consequences
of Cognitive Dissonance.” A.E.R. 72 (June 1982): 307-19.

Denton, Frank T. “Professional Counselling as a Destabilizing Influence on
Speculative Markets.” Res. Rep. no. 100. Hamilton: McMaster Univ., Program
Quantitative Studies Econ. and Population, June 1984.

Denton, Frank T. “Data Mining as an Industry.” Rev. Econ. and Statis. 67
(February 1985): 124-27.

Figlewski, Stephen. “Information Diversity and Market Behavior.” J. Finance
37 (March 1982): 87-102.

“Kaufman Triggers the Latest Stampede.” Maclean's (August 30, 1982).

Maital, Shlomo. Minds, Markets, and Money: Psychological Foundations of Economic
Behavior. New York: Basic, 1982.

Malkiel, Burton G. Winning Investment Strategies. New York: Norton, 1982.

Rockwell, C. S. “Normal Backwardation, Forecasting, and the Returns to
Commodity Futures Traders.” In Selected Writings on Futures Markets, edited
by Anne Peck. Chicago: Board of Trade, 1977.

Stein, Jerome L. “Real Effects of Futures Speculation.” Working Paper no.
83-2. Providence, R.I.: Brown Univ., Dept. Econ., March 1983.





Sterling, Theodore D. “Publication Decisions and Their Possible Effects on
Inferences Drawn from Tests of Significance—or Vice Versa.” J. American
Statis. Assoc. 54 (March 1959): 30-34.

Stewart, B. “An Analysis of Speculative Trading in Grain Futures.” Tech.
Bull. no. 1001. Washington: U.S. Dept. Agriculture, Commodity Exchange
Authority, October 1949.

Tirole, Jean. “On the Possibility of Speculation under Rational Expectations.”
Econometrica 50 (September 1982): 1163-81.

Tullock, Gordon. “Publication Decisions and Tests of Significance: A Comment.”
J. American Statis. Assoc. 54 (September 1959): 593.

“Wall Street’s Wildest Week.” Newsweek (August 30, 1982).



The Effect of Labor Unions on Investment 
in Training: A Dynamic Model 


Yoram Weiss 

Tel-Aviv University and State University of New York at Stony Brook


This paper considers the determination of training requirements
imposed by senior incumbent workers on new entrants. The main
question is whether organized senior workers will set training standards
above the competitive level, reflecting an attempt to discourage
entry. It is shown that in the presence of effective intergenerational
transfers, a union dominated by senior workers will require
standards that conform with the competitive standards in the long
run. In the short run, however, overtraining will be required. If the
intergenerational transfers are constrained, then a second-best situation
arises and overtraining is required even in the long run.


I. Introduction 

Control over the amount of training required of new entrants is a 
potentially important tool available to many craft unions and profes¬ 
sional associations. It is often exercised through licensing based on 
schooling and apprenticeship requirements and on passing special 
examinations. The standards are to a large extent determined and 
monitored by the incumbent practitioners. Thus physicians are required
to train in various forms of schooling, an internship, and a
residency for up to 8 years. Architects need 5 years of schooling and If
years of apprenticeship. Accountants require 4 years of college and 1
year of apprenticeship. Less widely known are the schooling and
apprenticeships of barbers, plumbers, and mine foremen, which require
up to 5 or 6 years of investment. Finally, most craft unions require
apprenticeship periods of 2-4 years (see Kolberg 1976).

Two questions arise: Would such long periods of training be ob¬ 
served also under competitive conditions, where each worker is free 
to choose his own level of training? What prevents the senior workers 
from imposing very severe entry standards? The conventional view is 
summarized by Rottenberg (1980, p. 9): “Practitioners in licensed 
occupations are given an advantage if the standards and costs of entry 
are made higher than they were when current practitioners entered 
the relevant occupation. Practitioners have an incentive to promote 
continuously higher standards and costs of entry into licensed oc¬ 
cupations.” 

In this paper I show that the impact of labor unions on training
depends on the organization of the union and its power to control 
other variables in addition to training. I consider a union consisting of 
two overlapping generations—young and old. The old workers have 
the power to control training and to impose an entry fee. It is shown 
that if the intergenerational transfers that incumbents can impose on 
the new entrants are unconstrained, then unions will select the 
efficient (i.e., competitive) level of training in the long run. Only in 
the short run will the training requirement exceed the efficient level. 
If, however, there are effective constraints on intergenerational trans¬ 
fers, then overtraining will emerge even in the long run. The general 
conclusion is that overtraining, when it arises, reflects a second-best 
solution to the union. The first-best solution for the union is to restrict 
entry via the exclusive use of entry fees. The extent of overtraining 
will therefore depend on the arrangements within the union that are 
designed to transfer resources from new entrants to incumbents. 

The evidence suggests that entry fees in most unions are small. 1 
There are, however, substantial contributions into pension funds that 
are not immediately vested and several other fringe benefits that are 


1 A survey conducted in 1974 shows that “of the 89 unions with initiation fees . . . ,
45—representing 74 percent of the membership . . . —charged less than $40. Major
unions charging between $20 and $39 include the Auto Workers; State, County and
Municipal Employees; Clothing Workers; and Paperworkers. The 29 unions (20 percent
of membership) reporting payments of $100 or more tended to represent small
numbers of specialized professional workers or craftworkers, such as the Elevator
Constructors, Directors Guild, Football Players, and Radio Association (which charged
$2,000). A few large unions, such as the Mine Workers and Iron Workers, also charged
some new members $100 or more” (Hickman 1977, p. 20). These small fees puzzled
economists, e.g., Becker (1959). Martin (1980, p. 113) suggests that “the nonproprietary
status of union membership precludes personal claims to revenues generated
from the sale of new membership,” and therefore direct transfers are replaced by
various seniority-related wages, hours, layoff, regressive dues, and discriminatory
benefit programs.
related to seniority (e.g., paid vacations) and thus serve as a tool to
impose intergenerational transfers. 2 On the other hand, unions are
known to flatten the wage profile (see Freeman and Medoff 1981). It
is not clear what is the magnitude of the net intergenerational transfer
(fringe benefits and wages) caused by unionization. In some
licensed occupations such as medicine or accounting, no direct mechanism
seems to exist for intergenerational transfers. Instead, during
the training period (residency and internship) junior workers produce
services from which senior workers benefit. The size of the
transfer implicit in this arrangement is substantial (see Marder and 
Hough 19811). 

The available evidence on the effects of unionization on training is
scant. In recent papers Duncan and Stafford (1980) and Mincer
(1981) provide evidence that union members learn fewer skills on the
job that could lead to a better job or to a promotion. These studies do
not exclude, of course, the possibility that new entrants are selected
on the basis of prior training (see Mincer 1981, p. 42), suggesting
higher training requirements. Indeed there are several studies indicating
that, other things being equal, workers in unionized industries
tend to have more human capital (see Freeman and Medoff 1981).
Finally, several instances have been documented in which unionization,
or the establishment of a professional association, resulted in
increased training standards (see Friedman and Kuznets 1945;
Denham 1980). To the best of my knowledge the interaction between
training requirements and the use of instruments to tax new entrants,
such as vested pension funds and entry fees, has not been studied.

II. A Dynamic Model of the Labor Union 

Consider an economy in discrete time in which workers live for 2 
periods. Let there be a labor union consisting in each period of old 
and young workers. Assume that the old workers can impose a training
requirement for young workers and charge an entry fee. Young
workers can decide whether to join the union under these conditions.

Each young worker enters with one efficiency unit. Training affects
the number of efficiency units the worker will have in the second
period of his life. Let $\lambda_t$ denote the portion of time spent in training by
a worker in period t, $0 \le \lambda_t \le 1$; then the worker will have $P(\lambda_t)$
efficiency units in period t + 1. The learning function $P(\cdot)$ is assumed
to be twice differentiable, monotone increasing, and strictly
concave. Thus $P'(\cdot) > 0$, $P''(\cdot) < 0$, and $P(0) = 1$.

2 During the period 1967-72 unionized firms in manufacturing allocated 3.3 percent
of total compensation per man-hour into pension funds (excluding social security). In
nonunion establishments the percentage was 1.7. Paid vacations that are strictly seniority
related accounted for 4.1 and 2.7 percent, respectively (see Freeman 1981). Alpert
(1982) finds that the fringe benefit-wage ratio is increasing with age and with a positive
interaction with union status. This holds for each of the components, health and life
insurance, pensions, and vacations, suggesting intergenerational transfers within
unions.

The number of young workers entering at period t is denoted by $N_t$,
$N_t \ge 0$. The number of senior workers actually employed is $L_t$, where
$L_t \le N_{t-1}$. I assume that quits do not occur. 3

Old and young workers are perfect substitutes in production, 4 and
the amount of efficiency units available to the industry is

$$E_t = P(\lambda_{t-1})L_t + (1 - \lambda_t)N_t. \tag{1}$$

The union determines the total supplies of young and old workers.
The industry is competitive and firms compete for the available supplies.
Wages are determined so as to clear the markets in each period.
The relative wages of young and old are fully determined by their
relative efficiency, and the wage per efficiency unit, $W_t$, is equated to
the value of the marginal productivity of workers. 5 Thus

$$W_t = F(E_t), \tag{2}$$

where $F(\cdot)$ is the inverse industrywide demand function for labor
and is assumed to be positive, decreasing, and twice differentiable.
Let us further assume that the wage bill $R(E_t) = E_t F(E_t)$ is bounded
above and strictly concave, and that labor is essential to the industry;
that is, $\lim_{E \to 0} F(E) = \infty$.
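
To fix ideas, equations (1) and (2) can be evaluated for particular functional forms. The forms below are assumptions made only for illustration and are not taken from the paper: a learning function $P(\lambda) = 1 + a\ln(1 + \lambda)$ and a wage bill $R(E) = A\sqrt{E}/(1 + \sqrt{E})$, chosen because they satisfy the stated conditions ($P(0) = 1$, P increasing and concave, R bounded and strictly concave, F unbounded as E approaches zero). All names and parameter values are hypothetical.

```python
import math

A_DEMAND = 10.0   # assumed upper bound of the wage bill
A_LEARN = 1.6     # assumed curvature of the learning function

def P(lam: float) -> float:
    """Illustrative learning function: efficiency units next period after training lam."""
    return 1.0 + A_LEARN * math.log(1.0 + lam)

def F(E: float) -> float:
    """Illustrative inverse demand: wage per efficiency unit, unbounded as E -> 0."""
    x = math.sqrt(E)
    return A_DEMAND / (x * (1.0 + x))

def efficiency_units(lam_prev: float, L: float, lam: float, N: float) -> float:
    """Equation (1): seniors supply P(lam_prev) each, juniors supply (1 - lam) each."""
    return P(lam_prev) * L + (1.0 - lam) * N

def wage_bill(E: float) -> float:
    """R(E) = E * F(E), with the wage given by equation (2)."""
    return E * F(E)

if __name__ == "__main__":
    E = efficiency_units(lam_prev=0.4, L=1.0, lam=0.4, N=1.0)
    print(E, F(E), wage_bill(E))
```

Raising current training lowers E today through the junior term and raises it next period through the senior term, which is the trade-off the rest of the model prices.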

The supply of young workers to the union is perfectly elastic at the 
level of lifetime earnings that can be obtained outside the unionized 
industry, $y_0$. 6 The lifetime earnings that a new entrant expects if he
enters the union depend on the entry conditions imposed by the 
current senior workers and on the option value of becoming a senior 


3 If the income per senior member exceeds the alternative productive use of the
senior workers' time, there will be no desire to quit. One can imagine, however, initial
conditions in which there are so many senior workers that quitting becomes attractive.
To make the model strictly correct I have to assume either that such a situation does not
arise or that the net productivity outside the industry (including moving costs) is zero.
The latter assumption will also guarantee that the union does not “send” some of its
senior workers into other industries when it wants to restrict their supply.

4 This simplifying assumption is maintained in order to sharpen the question raised
in the Introduction: Why are the young workers admitted into the union? Clearly, a
cheap answer would be to assume that both young and old are required for the production
process. I intentionally avoid this assumption, emphasizing instead the financial
motive on the part of the senior workers.

5 The assumption that the union can select points only on the demand function
implies the absence of bargaining between the union and the employers. In a more
general model one could allow a more active role for the employers.

6 There is no difficulty in modifying this assumption to the case of an increasing
supply function. Such a modification may be required if some pretraining or specific
skills are required in order to work in the unionized industry.
worker next period. In each period the income of senior workers
consists of their wage fund, $L_t P(\lambda_{t-1}) W_t$, and the taxes collected from
new entrants, $T_t N_t$. I assume that the per capita tax, $T_t$, imposed on
new entrants cannot exceed some prespecified value $T_0$; that is,

$$T_t \le T_0. \tag{3}$$

This constraint represents minimum wage laws or other forms of
intervention that restrict the monopoly power of senior workers.

For $N_{t-1} > 0$, let $V(\lambda_{t-1}, N_{t-1})$ denote the maximal collective income
of the current senior workers assuming that all future generations
choose their controls optimally. It is assumed that the option for
unionization extends indefinitely into the future. For $N_t > 0$, we can
define the expected net earnings of each junior worker to be

$$y_t = -T_t + (1 - \lambda_t)W_t + \frac{1}{1 + r}\,\frac{1}{N_t}\,V(\lambda_t, N_t), \tag{4}$$

where r is the market rate of interest. We can now specify the supply
constraint facing the senior workers as

$$y_t \ge y_0 \quad \text{for all } N_t > 0. \tag{5}$$

For $N_{t-1} > 0$, $V(\lambda_{t-1}, N_{t-1})$ is defined recursively by

$$V(\lambda_{t-1}, N_{t-1}) = \max\,[L_t P(\lambda_{t-1}) W_t + T_t N_t] \tag{6}$$

subject to the equalities (1), (2), and (4) and the inequalities (3), (5),
and (7) below:

$$0 \le \lambda_t \le 1, \qquad L_t \le N_{t-1}. \tag{7}$$

The statement of the problem can be simplified substantially if one 
can ensure that senior workers always allow in some junior workers. A 
sufficient condition is provided in proposition 1. 

Proposition 1. If $R[P(1)]/(1 + r) > y_0$, then the optimal policy of
the senior workers is to allow in some junior workers.

Proof. Suppose $N_{t-1} > 0$ and $N_t = 0$ for some t. In this case the
income of the current seniors is
$\max_{L_t \le N_{t-1}} \{L_t P(\lambda_{t-1}) F[P(\lambda_{t-1}) L_t]\}$.
Now consider an alternative feasible program of setting $\lambda_t = 1$ and
$N_t = 1$. This policy provides the same current income and potential tax
revenue to the senior workers. By (4) and (5) the maximal tax revenue
is $\min\{T_0, [1/(1 + r)]V(1, 1) - y_0\}$. From the recursive definition (6), it
is clear that $V(1, 1) \ge P(1)F[P(1)]$. Hence given the hypothesis of the
theorem the tax revenue will be positive, and the assumed policy with
$N_t = 0$ cannot be optimal.

Given our assumption that labor is essential to the industry, the
required condition that a single worker with the maximal number of
efficiency units can enjoy positive rent is rather mild. The result in
proposition 1 can be easily explained. From the point of view of
senior workers the admission of junior workers yields two opposing
effects: it reduces the wages of senior workers and broadens the tax
base. By allowing young workers in but forcing them to train at the
highest possible rate, $\lambda = 1$, the negative effect on wages can be
eliminated while the taxing power over future generations is retained.
As long as a positive tax can be collected some junior workers will be
allowed in. Thus despite the assumption that junior workers are not
essential for the production process in any given period the generational
chain continues.
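
The condition in proposition 1 is easy to check numerically once functional forms are chosen. The sketch below reads the condition as $R[P(1)]/(1 + r) > y_0$ and uses assumed forms that are not from the paper: $P(\lambda) = 1 + a\ln(1 + \lambda)$ and $R(E) = A\sqrt{E}/(1 + \sqrt{E})$. The parameter values and the function name are hypothetical.

```python
import math

def proposition1_holds(r: float, y0: float, a: float = 1.6, A: float = 10.0) -> bool:
    """True if a single fully trained worker earns a discounted wage bill above y0."""
    p_max = 1.0 + a * math.log(2.0)      # P(1): efficiency units after full-time training
    x = math.sqrt(p_max)
    wage_bill = A * x / (1.0 + x)        # R[P(1)] under the assumed demand
    return wage_bill / (1.0 + r) > y0

if __name__ == "__main__":
    print(proposition1_holds(r=0.1, y0=1.0))    # outside earnings low relative to rents
    print(proposition1_holds(r=0.1, y0=50.0))   # outside earnings above any feasible rent
```

Because the assumed wage bill is bounded by A, the condition must fail once $y_0(1 + r)$ exceeds A, which is the sense in which the condition restricts outside opportunities rather than the union's technology.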

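Let $\mu_t$ and $\psi_t$ denote the shadow prices for the supply constraints of
old and young workers, respectively, as specified in (5) and (7) above.
The sequences $L_t$, $N_t$, $\lambda_t$, $T_t$, which correspond to the recursive
maximization in (6), must satisfy the following first-order conditions:

$$P(\lambda_{t-1})\left[F(E_t) + L_t P(\lambda_{t-1})F'(E_t) + (1 - \lambda_t)\psi_t F'(E_t)\right] - \mu_t = 0, \tag{8}$$

$$L_t P(\lambda_{t-1})F'(E_t)(1 - \lambda_t) + T_t + \psi_t\left\{(1 - \lambda_t)^2 F'(E_t) + \frac{1}{1 + r}\,\frac{1}{N_t}\left[\frac{\partial V}{\partial N_t}(\lambda_t, N_t) - \frac{V(\lambda_t, N_t)}{N_t}\right]\right\} = 0, \tag{9}$$

$$-L_t P(\lambda_{t-1})N_t F'(E_t) + \psi_t\left[-(1 - \lambda_t)N_t F'(E_t) - F(E_t) + \frac{1}{1 + r}\,\frac{1}{N_t}\,\frac{\partial V}{\partial \lambda_t}(\lambda_t, N_t)\right]
\begin{cases} \le 0 & \text{if } \lambda_t = 0, \\ = 0 & \text{if } 0 < \lambda_t < 1, \\ \ge 0 & \text{if } \lambda_t = 1, \end{cases} \tag{10}$$

$$N_t - \psi_t \ge 0 \quad \text{with equality if } T_t < T_0. \tag{11}$$

In addition we have the complementary slackness conditions:

$$\mu_t \ge 0 \quad \text{with equality if } L_t < N_{t-1}, \tag{12}$$

$$\psi_t \ge 0 \quad \text{with equality if } -T_t + (1 - \lambda_t)F(E_t) + \frac{1}{1 + r}\,\frac{1}{N_t}\,V(\lambda_t, N_t) > y_0. \tag{13}$$

Finally, differentiating (6) with respect to the initial state variables,
using the envelope relation, we have

$$\frac{\partial V}{\partial N_{t-1}} = \mu_t, \tag{14}$$

$$\frac{\partial V}{\partial \lambda_{t-1}} = L_t P'(\lambda_{t-1})\left[F(E_t) + L_t P(\lambda_{t-1})F'(E_t) + \psi_t(1 - \lambda_t)F'(E_t)\right]. \tag{15}$$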

Shifting the time index in (14) and (15) and using (5), we can eliminate
the value function and its derivatives from equations (8)-(10) to
obtain a system of difference equations in the variables $L_t$, $N_t$, $\lambda_t$, $T_t$,
and the shadow prices $\mu_t$ and $\psi_t$.

If the constraint (3) on intergenerational transfers is not binding,
then $\psi_t = N_t$ and the system (8)-(10) is reduced (at an interior solution)
to the following system:

$$MR_t \ge 0 \quad \text{with equality if } L_t < N_{t-1}, \tag{16}$$

$$(1 - \lambda_t)MR_t + \frac{P(\lambda_t)}{1 + r}\,MR_{t+1} = y_0, \tag{17}$$

$$-N_t MR_t + L_{t+1}\,\frac{P'(\lambda_t)}{1 + r}\,MR_{t+1} = 0, \tag{18}$$

where we denote $MR_t = (d/dE_t)[F(E_t)E_t]$.

The first-order conditions (16)-(18) also arise in a planning problem
where the infinite discounted sum of rents, $E_t F(E_t) - N_t y_0$, is
maximized. Thus the maximization by each generation of senior
workers of its own income is equivalent to the joint maximization of
the present value of all future rents. This is to be expected since
without an effective constraint on the entry fees the current generation
of senior workers absorbs all current and future rents. Thus the
model induces rather naturally an objective function for the union.
The marginal revenues at each period play the role of prices in the
maximization of this objective. Specifically, condition (16) states that
senior workers will not lay off their own members unless the “price”
of an efficiency unit, $MR_t$, is zero. Condition (17) states that new workers
will be allowed to enter as long as their contribution to the present
value of total earnings in the current and subsequent periods exceeds
their opportunity costs. The senior workers “purchase” young workers
at a supply price $y_0$ and sell their efficiency units at the prices $MR_t$
and $MR_{t+1}$ during the current and subsequent periods, respectively.
Condition (18) states that training will be determined so as to maximize
the lifetime output of each worker given the prices $MR_t$ and
$MR_{t+1}$. Thus the marginal cost in terms of the forgone current wage
bill is equated to the marginal benefit in terms of a future increase in
the wage bill. Because of the power to tax future generations the
future increase in the earning capacity of the currently young is accounted
as a benefit by the current generation of senior workers. For
the same reason the loss of current earning by the trainees is accounted
as a cost by the senior workers.
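
As a rough numerical illustration of conditions (16)-(18), consider a steady state in which no senior workers are laid off, so that L = N and MR > 0. Condition (18) then reduces to $P'(\lambda) = 1 + r$, and condition (17) gives $MR = y_0/[(1 - \lambda) + P(\lambda)/(1 + r)]$. The sketch below solves these under assumed functional forms that are not taken from the paper: $P(\lambda) = 1 + a\ln(1 + \lambda)$ and a wage bill $R(E) = A\sqrt{E}/(1 + \sqrt{E})$. The parameter values and all function names are hypothetical.

```python
import math

def P(lam, a):
    """Assumed learning function with P(0) = 1, P' > 0, P'' < 0."""
    return 1.0 + a * math.log(1.0 + lam)

def P_prime(lam, a):
    return a / (1.0 + lam)

def MR(E, A):
    """Marginal revenue dR/dE for the assumed wage bill R(E) = A*sqrt(E)/(1+sqrt(E))."""
    x = math.sqrt(E)
    return A / (2.0 * x * (1.0 + x) ** 2)

def steady_state(r=0.1, y0=1.0, a=1.6, A=10.0):
    """Interior steady state with L = N under the unconstrained system (16)-(18)."""
    lam = a / (1.0 + r) - 1.0                         # solves P'(lam) = 1 + r
    if not 0.0 < lam < 1.0:
        raise ValueError("parameters do not give an interior training level")
    mr = y0 / ((1.0 - lam) + P(lam, a) / (1.0 + r))   # from condition (17)
    lo, hi = 1e-12, 1e6                               # MR is decreasing in E
    for _ in range(200):                              # bisection on a log scale
        mid = math.sqrt(lo * hi)
        if MR(mid, A) > mr:
            lo = mid
        else:
            hi = mid
    E = math.sqrt(lo * hi)
    N = E / (P(lam, a) + 1.0 - lam)                   # from (1) with L = N
    return lam, mr, E, N

if __name__ == "__main__":
    lam, mr, E, N = steady_state()
    print(f"training={lam:.4f}  MR={mr:.4f}  E={E:.4f}  N={N:.4f}")
```

Under these assumptions the steady-state training level depends only on the learning function and the interest rate, not on demand or on $y_0$. It is the level an individual worker facing a constant wage would choose, which is consistent with the claim that unconstrained transfers lead to the competitive level of training in the long run.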
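If the constraint on intergenerational transfers is binding, then $\psi_t < N_t$,
and the system (8)-(10) is reduced (at an interior solution) to

$$MR_t + (\psi_t - N_t)(1 - \lambda_t)F'(E_t) \ge 0 \quad \text{with equality if } L_t < N_{t-1}, \tag{19}$$

$$(1 - \lambda_t)\left[MR_t + \left(\frac{N_t}{\psi_t} - 1\right)L_t P(\lambda_{t-1})F'(E_t)\right] + \frac{P(\lambda_t)}{1 + r}\left[MR_{t+1} + (\psi_{t+1} - N_{t+1})(1 - \lambda_{t+1})F'(E_{t+1})\right] = y_0 + T_0\left(1 - \frac{N_t}{\psi_t}\right), \tag{20}$$

$$-MR_t + \left(1 - \frac{N_t}{\psi_t}\right)L_t P(\lambda_{t-1})F'(E_t) + \frac{P'(\lambda_t)}{1 + r}\left[MR_{t+1} + (\psi_{t+1} - N_{t+1})(1 - \lambda_{t+1})F'(E_{t+1})\right]\frac{L_{t+1}}{N_t} = 0. \tag{21}$$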

The difference between the systems (19)-(21) and (16)-(18)
reflects the deadweight loss imposed by the constraint on intergenerational
transfers, (3). When the constraint (3) is binding, any gain (or
loss) to the young workers is discounted by the senior workers at a
rate $\psi_t/N_t$, indicating that an attempt to absorb these gains requires a
change in the basic variables of the system (i.e., $\lambda_t$, $N_t$, and $L_t$) and will
therefore reduce the total available pie. The implications of these 
costs of redistribution are as follows. Comparing (19) and (16), it is 
seen that the value associated with the employment of a senior worker 
is increased because the costs of wage reduction imposed on young 
workers are not fully accounted as a cost by senior workers. Compar¬ 
ing (20) and (17), it is seen that the value of employing a young 
worker is reduced in both the current and subsequent periods, since 
his earnings are not fully accounted as income by the senior workers. 
This reduction is relatively sharper in the present period since the 
senior workers still give full weight to the wage reduction imposed on 
themselves by the employment of a young worker. The same tilt in 
relative prices toward the future is noticed in the comparison of (21) 
with (18). The imposition of an effective constraint on intergenerational
transfers may also cause the supply constraint (5) to become
ineffective. This occurs only in the extreme case where training is set
at the maximum level, that is, $\lambda_t = 1$. In this case the union has no
taxing instruments left and will have to resort to queuing.

We can now describe the steady-state solution for the system (8)-(9).
If the constraint (3) is not binding in the steady state, then the
unique (interior) steady state that satisfies equations (16)-(18) is given
by

$$P'(\lambda^*) = 1 + r, \tag{22}$$

$$MR(E^*)\left[1 - \lambda^* + \frac{P(\lambda^*)}{1 + r}\right] = y_0. \tag{23}$$

Using equations (22) and (23) we can solve for the implied values of
$N^*$ and $T^*$; that is,

$$N^* = \frac{E^*}{1 - \lambda^* + P(\lambda^*)} \tag{24}$$

and

$$T^* = \frac{1 + r}{r}\, y_0 \left[\frac{F(E^*)}{MR(E^*)} - 1\right] \le T_0. \tag{25}$$

The comparable competitive steady-state solution is given by

$$P'(\lambda') = 1 + r, \tag{26}$$

$$F(E')\left[1 - \lambda' + \frac{P(\lambda')}{1 + r}\right] = y_0. \tag{27}$$

The competitive solution has the property that the lifetime earnings
of each worker are maximized given the steady-state wage, and in
addition the maximized lifetime income (output) of each worker is
equated to his opportunity cost. Thus the competitive solution is
efficient. In contrast, the monopoly solution implies that the value of
the worker’s lifetime output exceeds his opportunity cost $y_0$. In this
sense the allocation is inefficient. The level of training, however, is set 
at the efficient level, given the monopoly, steady-state, wages. We thus 
conclude the following: 

Proposition 2. If the constraint on intergenerational transfers is 
not binding, then the steady-state level of training chosen by the 
union is the competitive one, and no senior worker is laid off at the 
steady state. 

Proposition 2 states that the only lasting effect of the union is the
restriction on the supply of efficiency units. Given this “output,”
which the union “sells,” inputs are chosen efficiently so as to minimize
the resource costs required to attain them. An inefficient level of 
training would imply that the lifetime earnings of a young worker 
could be increased without affecting the wage. Since senior workers 
can appropriate all the resulting gains by taxing new entrants, such 
inefficiencies will not arise. 

The level of intergenerational transfer implied by formula (25) can
be substantial. Setting the intergenerational interest rate at $r = 1$, that is,
100 percent (a real interest rate of 2 percent per annum compounded over a
generation of 35 years), and the relative wage effect of the union,
$F(E^*)/MR(E^*) - 1 = (W^*/W') - 1$, at 10-15 percent, as indicated by most
studies, one gets that $T^*$ is 20-30 percent of the lifetime earnings in the
alternative occupation, $y_0$. It is doubtful whether, in practice, the net seniority
premium (in wages and fringes combined) is sufficient to impose such
a transfer. Consider, therefore, the case in which $T^*$ as given by (25)
exceeds $T_0$. Then the inequality in (25) cannot be maintained and the
constraint (3) must be binding in the steady state. Hence, equations
(19)-(21) must be applied. Proposition 3 below follows immediately
from (21).
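
The orders of magnitude quoted for formula (25) can be checked with a few lines of arithmetic. The Python sketch below assumes that (25) has the form T* = [(1 + r)/r] y_0 [F(E*)/MR(E*) - 1], which is the reading consistent with the 20-30 percent range given in the text; the function names are illustrative only and are not part of the original analysis.

```python
def generational_rate(annual_rate: float, years: int) -> float:
    """Compound an annual real interest rate over one generation."""
    return (1.0 + annual_rate) ** years - 1.0


def transfer_share(r: float, wage_effect: float) -> float:
    """T*/y0 under the assumed form of (25): ((1 + r) / r) * (F/MR - 1)."""
    return (1.0 + r) / r * wage_effect


# 2 percent per annum over a 35-year generation, as in the text.
r = generational_rate(0.02, 35)

for effect in (0.10, 0.15):
    share = transfer_share(r, effect)
    print(f"relative wage effect {effect:.0%}: T*/y0 = {share:.1%}")
```

With these inputs the generational rate is very close to 1, so the transfer is roughly twice the relative wage effect of the union.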

Proposition 3. If the constraint on intergenerational transfers is
binding in the steady state, then the level of training required by the 
union will exceed the competitive (and efficient) level. 

The cause for the difference between propositions 2 and 3 is that 
with a binding constraint on intergenerational transfers the senior 
workers have an added incentive to induce entry, namely, to broaden 
their tax base. A higher training requirement is imposed in order to 
mitigate the negative effect on current wages. 

It is important to note that even if intergenerational transfers can 
be freely imposed, there will be a union effect on the investment in 
training outside the steady state. Generally speaking, the training requirement
at period $t$ is determined by the ratio $MR_{t+1}/MR_t$ under
monopoly and by the ratio $W_{t+1}/W_t$ under competition. At the respective
steady states, these ratios are both equal to 1, and there is thus no
union effect. This neutrality need not hold outside the steady state
since the respective price ratios, $MR_{t+1}/MR_t$ and $W_{t+1}/W_t$, may follow
different time paths under monopoly and competitive situations.
While it is difficult to compare the whole time path of training under 
competition and monopoly, one can compare the first element of each 
of the two sequences. Consider, for instance, the case in which a 
competitive industry (in the steady state) becomes unionized at time 0. 
The initial state of the union is such that there are too many senior 
workers (i.e., $N_{-1} > N^*$). By laying off some of the senior workers it is
feasible to adopt immediately the union’s steady-state policy, setting
$\lambda_t = \lambda^*$, $N_t = N^*$, and $L_t = N^*$ for $t = 0, 1, 2, \ldots$. Clearly for such a
policy, $MR_0 = MR^* > 0$. But then, as seen from condition (16),
optimality is not attained and the union can gain by increasing $L_0$,
holding all other variables the same. Such an adjustment will reduce
$MR_0$ and call for further adjustments. In particular, since $MR_0$ has
now been reduced below the future values of $MR$, there will be an
incentive to increase $\lambda_0$ above $\lambda^*$. Hence, the immediate impact of
unionization will be to increase the level of investment in training 
even when there is no long-run effect.

An interesting aspect of the behavior of the system outside the
steady state is that prices and quantities move in cycles. A period with
relatively stringent standards is followed by a more relaxed one. This
is seen, for instance, in equation (18), where it is assumed that the
constraint (3) is not binding. Since each junior worker is retained for 2
periods and is paid a fixed amount in present value terms, $y_0$, any
change that leads to a current reduction in $MR_t$ must be offset by a
corresponding increase in $MR_{t+1}$.

The dynamic adjustments outside the steady state reflect the con¬ 
straint that the senior workers who are not employed cannot be gain¬ 
fully employed elsewhere. If one assumes that the union can sell the 
efficiency units of its laid-off senior workers at a price $W' = MR(E^*)$,
then it is possible to “jump” to the steady state immediately from any 
initial conditions. In most cases training is, at least partially, industry 
specific and such an option is not feasible. Therefore the existing 
stock of senior workers and their efficiency units will affect the de¬ 
mand for new entrants by the incumbents. If the current number of 
senior workers is large, then relatively few new young workers are 
necessary to generate the desired supply of efficiency units, and cycles 
arise. A condition for local stability of the price sequence $\{MR_t\}$ and
the associated quantity sequence $\{E_t\}$ is

$$\frac{P(\lambda^*)}{1 - \lambda^*} > 1 + r. \tag{28}$$

The stability condition (28) has the interesting implication that the
lifetime (net) productivity of each worker increases, at the steady
state, at a rate that exceeds the interest rate. The tilt in productivity
toward the second period of the worker’s life allows for a smaller
adjustment in the future price $MR_{t+1}$ as $MR_t$ changes.
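
The alternating pattern and the role of the productivity tilt can be illustrated numerically. The sketch below is a simplification, not the full system: it holds training fixed at a steady-state level and assumes the entry condition takes the form (1 - λ) MR_t + P(λ) MR_{t+1}/(1 + r) = y_0, which is how condition (17) is described verbally in the text. All parameter values are hypothetical.

```python
def price_path(mr0, lam, p_lam, r, y0, periods):
    """Iterate MR_{t+1} from (1 - lam) * MR_t + p_lam * MR_{t+1} / (1 + r) = y0."""
    path = [mr0]
    for _ in range(periods):
        mr_next = (1.0 + r) * (y0 - (1.0 - lam) * path[-1]) / p_lam
        path.append(mr_next)
    return path


# Hypothetical values: training share, second-period productivity P(lam),
# generational interest rate, and opportunity cost.
lam, p_lam, r, y0 = 0.3, 2.0, 1.0, 1.0

# Steady-state price implied by the same condition with MR_t = MR_{t+1}.
mr_star = y0 / (1.0 - lam + p_lam / (1.0 + r))

# Start 10 percent below the steady state and follow the deviations.
path = price_path(0.9 * mr_star, lam, p_lam, r, y0, periods=6)
deviations = [mr - mr_star for mr in path]

# Each deviation is multiplied by -(1 + r)(1 - lam) / p_lam per period.
adjustment_factor = -(1.0 + r) * (1.0 - lam) / p_lam
```

In this example P(λ)/(1 - λ) exceeds 1 + r, so the adjustment factor is smaller than 1 in absolute value and the deviations change sign each period while shrinking; with a flatter productivity profile the same recursion would diverge.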

As an illustration of the applicability of the analysis consider the 
medical profession in the United States, which is reputedly run by a 
labor monopoly. The available evidence suggests that the incumbents 
are able to absorb all available rents and, as a consequence, “physi¬ 
cians have received only small rents during equilibrium periods” (see 
Leffler 1978, p. 185). The transfers take the form of apprenticeship 
wages to residents and interns that are below the value of their ser¬ 
vices. The annual value of these transfers was estimated in 1979 to be 
$20,542 per annum (see Marder and Hough 1983). In this case the 
transfers are tied to the duration of the training period and are constrained
by the minimum wage that must be paid to residents. One
would therefore expect that overtraining would arise. Friedman and 
Kuznets (1945, chap. 1) document the sharp increase in standards 
and the reduction in the number of schools and medical students 
following the establishment of the American Medical Association in
1904. The number of admissions later increased, stabilizing the system
at a lower number of physicians per capita (see Lewis 1963, pp.
114-35). The admission standards have also stabilized but at a higher 
level than the initial one. The long-run impact and the cyclical pattern 
of adjustment correspond to the predictions of the model presented 
in this paper. 

III. Conclusions 

The paradigm of overlapping generations seems to provide a flexible 
tool for the analysis of dynamic models of union behavior. The basic 
insight of this approach is that the conflict between senior and junior 
workers that arises from their substitutability in production cannot be 
resolved without some expectations about the future in which different 
agents will also be involved. Thus, requiring high training standards 
may be beneficial to the incumbents, but they must also concern them¬ 
selves with the future impact on the earnings of the trainees, since this 
will affect the rents they can extract from the young. The future 
impact is partially dependent on the training of junior workers, con¬ 
trolled by the current union, and partially on the decisions of the 
future generations of senior workers. Assuming that expectations are
formed rationally, that is, that each generation takes into account the 
correct mode of behavior for future generations, the analysis can be 
reduced to a dynamic programming problem. In particular, the case 
in which transfers can be effectively imposed can be reduced to an 
almost standard optimal growth model. 

In describing the solutions, two basic distinctions were made: bind¬ 
ing versus nonbinding constraints on intergenerational transfers and 
long-run versus short-run effects. In the long run the training re¬ 
quirements set by the union will be efficient if direct transfers can be 
freely imposed. In other words, training requirements will not be 
used to regulate entry if a more efficient tool, such as an entry fee, is 
available. A lasting long-run positive effect on training arises only 
when the union cannot impose effective intergenerational transfers. 
Even in the absence of such constraints, however, the initial impact of 
unionization is to raise the training standards above the efficient level. 
This initial impact is followed by a series of cyclical adjustments in 
which training standards are raised and reduced alternately. 

The simple model presented here can be extended in several ways. 
Rather than assume that the senior workers absorb all the rents, 
either directly by transfers or indirectly by training requirements, one 
can allow the young workers to organize in an attempt to retain some 
part of the rents. If the bargaining over the division of rents is resolved
by a Nash solution, then it can be shown that overtraining
arises in the steady state (see Weiss 1984). The reason, in this case, for
overtraining is the positive impact of training on the future bargain¬ 
ing power of the young. The inefficiency reflects the absence of bar¬ 
gaining among generations that do not coexist and is in this sense akin 
to the second-best situation discussed in this paper. Another natural 
extension is to incorporate explicitly the services the young provide to 
the old as part of the apprenticeship relationship. This has the effect 
of changing the definition of the efficient level of training, but the 
results are similar to those presented here: overinvestment in training
occurs when intergenerational transfers, that is, the gap between the 
apprenticeship wage and the value of services performed, are effec¬ 
tively constrained (see Weiss 1984). 

The real-life manifestations of overtraining are varied: heavy, and 
apparently irrelevant, schooling requirements; long waiting times for 
board examinations with inexplicably high failure rates; and long 
apprenticeship periods. It is important to realize that even though 
training time is not completely wasted—that is, workers do indeed 
improve—overtraining reduces the efficiency of human capital. The 
point is that over a whole lifetime, a worker is not used in the best 
possible way. While much attention has been given to the possible
adverse effects of unions on the use of technology and capital, their 
impact on human capital is often ignored. It is not inconceivable that 
empirical evidence will show the union effects on investment in human
capital to be quite important.

Comments


Interest Rate Variability and Economic 
Performance: Further Evidence 

John A. Tatom 

Federal Reserve Bank of St. Louis


Evans (1984) argues that a policy-induced rise in unanticipated interest
rate volatility in 1980-82 substantially and significantly reduced
output in the U.S. economy. He concludes that “stabilizing interest
rates is probably sensible monetary policy” (p. 204). A second hy¬ 
pothesis, that increased volatility of money growth has contributed to 
stagnation, is rejected by Evans. 1 

This comment raises some questions about the strength of Evans’s 
conclusions and about the implicit channels of influence for his results.
Evans's conclusion that only past unanticipated volatility affects
output is theoretically and empirically weak. Anticipated changes in 
volatility are shown to have similar effects on output. Second, Evans 
implies that in Barro’s (1981) output model the channel of the interest 
rate volatility effect on the economy is through an unanticipated de¬ 
cline in aggregate demand. The price level effect of such an unantic¬ 
ipated reduction in aggregate demand is tested in Barro’s price equa¬ 
tion. In examining these issues, an alternative measure of money 


I wish to thank Paul Evans, Allan Meltzer, Robert Rasche, and an anonymous referee
for helpful comments on earlier drafts of this comment and Tom Gregory for his
excellent research assistance. The views expressed here are not necessarily shared by
the Federal Reserve Bank of St. Louis or the Board of Governors of the Federal 
Reserve System. 

1 Surprisingly, even if it did have such an effect, his chart of volatility (fig. 1, p. 211)
indicates that it would have played no role in the 1980-82 downturn because it was
essentially unchanged from 1977 to 1982, roughly equaling its postwar average.

growth variability is used. This money growth measure is closer in
spirit to the one Evans uses for the interest rate and to the appropri¬ 
ate measure of real risk arising from monetary policy actions. 

The results indicate that (1) it is not possible using Evans’s model to 
disentangle whether only unanticipated interest rate variability mat¬ 
ters; (2) one cannot reject the hypothesis that interest volatility in 
1980—82 rose, in part, because of an increase in the volatility of 
money growth; and (3) risk changes have dominant aggregate supply 
effects. The latter is in sharp contrast to the implied demand shock in 
Gertler and Grinols (1982), Mascaro and Meltzer (1983), and Evans 
(1984). 

Evans’s policy conclusions are not emphasized because he provides 
no evidence that the recent increased volatility of interest rates was a 
policy choice or that the choice of an interest rate targeting procedure 
would have lessened interest rate volatility. 2 The results below, however,
suggest that improved monetary control stabilizes interest rates,
prices, and output. 


Measures of Volatility 

Evans examines the effects of changes in the annual volatility of 
monthly changes in the Aaa bond yield. Specifically, he defines $VR_t$ to
be $\left[\frac{1}{12}\sum_{i=1}^{12}(\Delta r_{it} - \overline{\Delta r}_t)^2\right]^{1/2}$, where $\Delta r_{it}$ is the change in the bond yield in
month $i$ of year $t$ and $\overline{\Delta r}_t$ is the average monthly change in year $t$. 3
Evans’s measure of the volatility of money growth is $VM_t$, which measures
the “unpredictability of money growth” as the 6-year moving
standard deviation of unanticipated annual money growth. 
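
As an illustration of the interest rate measure, the Python sketch below computes $VR_t$ as the within-year standard deviation, with divisor 12, of the twelve monthly changes in the yield. The input layout (a mapping from year to its twelve monthly changes) and the sample numbers are assumptions made for the example; they are not Evans's data.

```python
from statistics import pstdev


def annual_rate_volatility(monthly_changes_by_year):
    """Return {year: VR_t}, the within-year standard deviation of monthly yield changes."""
    volatility = {}
    for year, changes in monthly_changes_by_year.items():
        if len(changes) != 12:
            raise ValueError(f"year {year}: expected 12 monthly changes")
        # pstdev divides by the number of observations (12), as in the definition.
        volatility[year] = pstdev(changes)
    return volatility


# Hypothetical monthly changes in the bond yield, in percentage points.
sample = {
    1978: [0.05, -0.05] * 6,
    1980: [0.60, -0.60] * 6,
}
vr = annual_rate_volatility(sample)
```

Because the yearly mean change is subtracted first, a steady drift in the yield within a year contributes nothing to the measure; only month-to-month variation around that drift does.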

A measure of intrayear variability such as that of quarterly money 
growth looks more like Evans’s volatility of interest rates (his fig. 2, p.
212). For example, in my figure 1 the volatility of annual money 
growth (6-year moving standard deviation of annual money growth) 


2 Walsh (1984) shows that, in contrast to an announcement, a change in the policy 
rule used by the Fed away from interest rate smoothing can lead to greater volatility of
both interest rates and the money supply. Risk in his model is wholly financial, however;
i.e., interest rates affect real wealth, given a constant expected earnings stream.
The most important aspect of a rise in the variance of money growth is that the variance 
of expected earnings streams, especially returns to capital, rises. There is no capital in 
the Walsh model. Whether his results hold in this broader context is not clear but is not 
the central problem of this comment. 

3 Evans offers no justification for the use of the standard deviation of monthly
changes. A more extensive version of this comment (Tatom 1984) includes an examination
of four other variability measures including the standard deviation of the level of
the interest rate, the logarithm of the rate, and the 5-year standard deviation of the
annual level of the interest rate. The results for the level and log of the rate are similar
to those found using Evans's measure.






is plotted along with the volatility of quarterly money growth (stan¬ 
dard deviation of quarterly money growth for the 4 quarters of 
money growth in each year), VM*. The actual money growth rate is 
used since there is little difference between it and unanticipated 
money growth (see, e.g., Sheehey 1984). Note that, as in Evans’s 
figure 1 (p. 211), the recent volatility of annual money growth is not
unusual, while that for quarterly money growth is dramatically higher
in 1980-82, much like Evans’s volatility of interest rates. This alternative
measure is taken to be more representative of the money growth
variability that matters, that is, the short-run variability of money
growth that has transitory effects on spending, output, and employment
(see, e.g., Poole 1975; Friedman 1982a, 1982b; Meltzer 1982).
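
The contrast between the two money growth measures can be made concrete with a small sketch. The code below computes an intrayear measure (the standard deviation of the four quarterly growth rates in each year) and a 6-year moving standard deviation of annual growth. The use of the population divisor and all of the numbers are assumptions for illustration; the text does not state which divisor was used.

```python
from statistics import pstdev


def intrayear_volatility(quarterly_growth_by_year):
    """Standard deviation of the four quarterly growth rates within each year."""
    return {year: pstdev(rates) for year, rates in quarterly_growth_by_year.items()}


def moving_annual_volatility(annual_growth, window=6):
    """Moving standard deviation of annual growth over `window` consecutive years."""
    return [
        pstdev(annual_growth[end - window:end])
        for end in range(window, len(annual_growth) + 1)
    ]


# Hypothetical growth rates (percent, annual rates), for illustration only.
quarterly = {
    1979: [7.0, 8.0, 7.5, 7.5],
    1980: [6.0, -3.0, 15.0, 10.0],
}
annual = [6.0, 7.0, 8.0, 7.5, 7.0, 7.5, 7.0]

vm_star = intrayear_volatility(quarterly)
vm_annual = moving_annual_volatility(annual, window=6)
```

In the hypothetical data, annual growth is nearly the same in both years, so the moving annual measure barely moves, while the intrayear measure rises sharply in the second year. This is the sense in which an annual measure can miss short-run variability.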


Anticipated versus Unanticipated Interest Rate 
Volatility 

Evans’s tests for money growth and interest rate volatility effects are 
conducted using a variant of Barro’s (1978, 1981) model for real 
output estimated over the periods 1947-78 and 1948-81. 4 The real


4 Evans provides no theoretical basis for the effect of interest rate volatility on output
other than to suggest that the Modigliani-Miller theorem (1958) applies to predictable
changes in interest rate variability and that such changes, then, could not affect the
private allocation of resources or output. This ignores the fact that these changes affect
market assessments of risk generally and hence affect the after-tax real rates of return,
the desired capital stock, price level, and output.

output ($\ln X$) equation contains a constant, trend, and transitory
movements in federal expenditures ($FEDV$), a lagged dependent variable,
and unanticipated money growth ($DMR$). The variable $DMR$ is
the residual from an equation explaining money growth (M1) in
terms of lagged money growth, $FEDV_t$, and the past 2 years’ data on
the unemployment rate.

Evans derives unanticipated interest rate volatility from an equation
relating the logarithm of $VR_t$ to a constant, time trend, and two
lagged values of itself. The residual from such an equation is the
unanticipated component, $EVR_t$. Such an equation for the period
1947-78 is given at the top of table 1. 5 The second lagged dependent
variable and the trend are not significant at conventional levels; when 
the lagged dependent variable is omitted, the trend remains insignifi¬ 
cant (t = 1.60). The same result occurs in the longer period. The 
second specification in table 1 is also used in the tests below to de¬ 
lineate anticipated/unanticipated movements in interest rate volatility 
and their output effects. 
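
The split into anticipated and unanticipated volatility under the simpler specification can be sketched as follows: regress the log volatility on a constant and its own first lag by ordinary least squares, take the fitted value as the anticipated component, and take the residual as the unanticipated component. The series below is hypothetical and the helper names are illustrative; this is a minimal sketch of the decomposition, not a reproduction of the estimates in table 1.

```python
def fit_ar1(series):
    """OLS of x_t on a constant and x_{t-1}; returns (intercept, slope)."""
    lagged, current = series[:-1], series[1:]
    n = len(current)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    sxx = sum((x - mean_lag) ** 2 for x in lagged)
    sxy = sum((x - mean_lag) * (y - mean_cur) for x, y in zip(lagged, current))
    slope = sxy / sxx
    return mean_cur - slope * mean_lag, slope


def decompose(series):
    """Return (anticipated, unanticipated) components for observations 1..T."""
    intercept, slope = fit_ar1(series)
    anticipated = [intercept + slope * prev for prev in series[:-1]]
    unanticipated = [cur - fit for cur, fit in zip(series[1:], anticipated)]
    return anticipated, unanticipated


# Hypothetical values of ln VR_t, for illustration only.
ln_vr = [-2.9, -2.5, -2.7, -2.2, -2.4, -1.8, -2.0, -1.1]
anticipated, unanticipated = decompose(ln_vr)
```

By construction the unanticipated component averages zero in sample and is uncorrelated with the lagged level, so the anticipated component for year t is an exact linear function of the actual value in year t-1.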

When the interest rate variability equation at the top of table 1, the 
money growth equation, and the $\ln X_t$ equation are jointly estimated,
Evans finds that $EVR_{t-1}$ is significantly negative in the $\ln X_t$ equation;
the contemporaneous $EVR$ and lagged anticipated component of $\ln VR$
are not. He concludes that unanticipated interest rate volatility 1
year earlier matters. This is a curious use of the notion of an unantic¬ 
ipated change. Typically (e.g., in the case of money growth), unantic¬ 
ipated changes in variables that influence decisions affect those deci¬ 
sions immediately, then, due to information or other adjustment 
costs, with a lag. In this case, the “unanticipated” information does 
not affect any decision until it is fully known and incorporated in 
current anticipations in the next year. 

Table 2 summarizes the t-statistics for the effects of adding contemporaneous
or lagged information on interest rate variability to the
real output equation for 1948-78 (t-statistics for the equations estimated
through 1981 are in parentheses). The t-statistic provides
information on the sign of the effect as well as its significance. The 
results in table 2 support the central thrust of Evans’s hypothesis, but 
a qualification appears to be in order since the current anticipated 


5 The results here are estimated using ordinary least squares (OLS). Evans uses the
full-information maximum-likelihood technique, following Barro (1981, pp. 150-59).
Ordinary-least-squares estimates are used here because they are sufficient to demonstrate
the issues raised and because Kmenta (1971, pp. 174-82) and Amemiya (1977)
have noted the sensitivity of maximum-likelihood estimators to specification errors.
Admittedly, however, this choice can be problematical. Ordinary-least-squares estimates
are not generally consistent if the true model is a simultaneous system; variable
selection results based on OLS methods may be inferior.

TABLE 1

Interest Rate Variability Equations (1947-78)

1. $$\ln VR_t = \underset{(.6308)}{-1.9352} + \underset{(.0126)}{.0257}\,t + \underset{(.1788)}{.7042}\,\ln VR_{t-1} - \underset{(.1781)}{.2526}\,\ln VR_{t-2} + EVR_t,$$

   $R^2 = .5020$, S.E. = .5244, D-W = 2.02

2. $$\ln VR_t = \underset{(.3897)}{-.8366} + \underset{(.1330)}{.6943}\,\ln VR_{t-1} + EVR_t,$$

   $R^2 = .4583$, S.E. = .5474, D-W = 1.78

Note. Standard errors are indicated in parentheses.

variability measure has the same significant negative effect as the 
lagged unanticipated measure. 6

The addition of $EVR_{t-1}$ to equations containing $\ln \widehat{VR}_t$ results in the
four equation estimates shown in table 3 for the two periods. The top
two equations use Evans's $\ln \widehat{VR}_t$ specification, while the bottom two
are for the simpler $\ln \widehat{VR}_t$ specification in table 1. The addition of
information on the lagged unanticipated volatility does not add statis¬ 
tically significant information to the real output equation containing 
this period’s predicted volatility. In both periods, one can reject the 
hypothesis that past unanticipated interest rate volatility affects real 
GNP when the contemporaneous anticipated volatility is taken into 
account (and conversely). Thus Evans's expectations hypothesis does
not appear to be supported by the data. 


Does the Volatility of Money Growth Matter? 

As noted earlier (fig. 1), quarterly money growth volatility increased 
dramatically in 1980 and 1982. The preferred (second) interest rate 

TABLE 2

t-Statistics for the Effect of Interest Rate Variability on Output (1948-78)

                             Actual             Predicted          EVR
1. ln VR_t (eq. 1)           -1.38 (-2.64)*     -2.72* (-3.61)*      .34 (-.42)
   ln VR_{t-1} (eq. 1)       -2.42* (-3.05)*     -.51 (-.48)       -3.22* (-4.28)*
2. ln VR_t (eq. 2)           -1.38 (-2.64)*     -2.42* (-3.05)*      .56 (-.05)
   ln VR_{t-1} (eq. 2)       -2.42* (-3.05)*     -.76 (-.59)       -2.60* (-3.40)*

Note. Statistics for 1948-81 in parentheses.

* Critical t-statistic (5 percent significance) is 2.06 for the period 1948-78 and 2.05 for the period 1948-81.


6 The t-statistics do not indicate the magnitude of differences in the fit of the equations,
however. Using the predicted $\ln VR_t$ in either period instead of $EVR_{t-1}$ raises the
standard error of the $\ln X$ equation of about 0.016 by less than 5 percent for the Evans
specification in both periods and the 0.017 standard error found using the $EVR$
from the simpler $\ln VR$ specification by 1.4 percent in the earlier period and 3 percent
in the longer period.

TABLE 3

Tests of Evans's Anticipated versus Unanticipated Variability Effects

$\ln VR_t = f(t, \ln VR_{t-1}, \ln VR_{t-2})$

$$\begin{aligned}
\text{1948-78:}\quad \ln X_t ={}& \underset{(2.93)}{2.125} + \underset{(2.32)}{.0109}\,t + \underset{(5.52)}{.667}\,\ln X_{t-1} + \underset{(4.18)}{1.1042}\,DMR_t \\
&+ \underset{(1.61)}{.5013}\,DMR_{t-1} + \underset{(.48)}{.0118}\,\ln \widehat{VR}_t - \underset{(-1.56)}{.0312}\,EVR_{t-1},
\end{aligned}$$

$R^2 = .9971$, S.E. = .0174, D-W = 1.94;

$$\begin{aligned}
\text{1948-81:}\quad \ln X_t ={}& \underset{(3.10)}{1.989} + \underset{(2.51)}{.0103}\,t + \underset{(6.45)}{.686}\,\ln X_{t-1} + \underset{(4.52)}{1.0710}\,DMR_t \\
&+ \underset{(1.46)}{.0410}\,DMR_{t-1} + \underset{(.23)}{.0037}\,\ln \widehat{VR}_t - \underset{(-1.88)}{.0281}\,EVR_{t-1},
\end{aligned}$$

$R^2 = .9979$, S.E. = .0160, D-W = 1.92

$\ln VR_t = f(\ln VR_{t-1})$

$$\begin{aligned}
\text{1948-78:}\quad \ln X_t ={}& \underset{(3.06)}{2.392} + \underset{(3.02)}{.0138}\,t + \underset{(4.76)}{.612}\,\ln X_{t-1} + \underset{(3.76)}{.9969}\,DMR_t \\
&+ \underset{(2.27)}{.6897}\,DMR_{t-1} - \underset{(-.77)}{.0085}\,\ln \widehat{VR}_t - \underset{(-1.14)}{.0120}\,EVR_{t-1},
\end{aligned}$$

$R^2 = .9971$, S.E. = .0174, D-W = 1.94;

$$\begin{aligned}
\text{1948-81:}\quad \ln X_t ={}& \underset{(3.18)}{2.2152} + \underset{(3.14)}{.0127}\,t + \underset{(5.63)}{.6414}\,\ln X_{t-1} + \underset{(4.07)}{1.0365}\,DMR_t \\
&+ \underset{(2.11)}{.5879}\,DMR_{t-1} - \underset{(-1.14)}{.0082}\,\ln \widehat{VR}_t - \underset{(-1.73)}{.0136}\,EVR_{t-1},
\end{aligned}$$

$R^2 = .9980$, S.E. = .0169, D-W = 1.89

Note. t-statistics are given in parentheses.


variability equation in table 1 was reestimated adding the variability of
money growth ($\ln VM^*$), to examine whether the recent increase in
variability of interest rates was related to the increased variability of
money growth. The results are given in table 4 for both time periods.
The equations show a significant positive contemporaneous relation
with money growth volatility over the period 1947-81. 7

This equation suggests that there is a link between Federal Reserve 
actions and the deleterious effects of interest rate variability. 8 Policy
actions that lower the variability of money growth apparently can 


7 The absence of a significant effect in the 1947-78 period is not surprising since 
VM* varied little over that period. The standard deviation of VM* over the period was 
only 0.825, while its mean level was 1.740. Evans has noted the instability of his $\ln VR$
equation after 1978. The out-of-sample errors of the top equation in table 1 for 1979-82
average 0.951, almost twice (1.84) the in-sample standard error; all the errors are
positive, and the errors in 1980—81 are each more than twice the in-sample standard 
error. 

8 The possibility of reverse causation was examined by regressing $\ln VM^*$ on lagged
levels of the logarithm of the interest rate variability measure. No significant lagged
effects were found during either period.

TABLE 4

The Variability of Money Growth and Interest Rates

$$\text{1947-78:}\quad \ln VR_t = \underset{(-2.51)}{-1.0345} + \underset{(4.90)}{.6574}\,\ln VR_{t-1} + \underset{(1.35)}{.2224}\,\ln VM^*_t,$$

$R^2 = .47$, S.E. = .5401, D-W = 1.89;

$$\text{1947-81:}\quad \ln VR_t = \underset{(-2.07)}{-.7542} + \underset{(6.42)}{.7564}\,\ln VR_{t-1} + \underset{(2.22)}{.8441}\,\ln VM^*_t,$$

$R^2 = .64$, S.E. = .5572, D-W = 1.89

Note. t-statistics are given in parentheses.


reduce interest rate volatility. 9 Other sources of the interest rate variability
are presumably beyond the control of the Federal Reserve, but
policy actions to control the growth rate of money, say, at least as well 
as before 1979, can reduce interest rate variability and real GNP 
losses associated with it. 

For each period, anticipated and unanticipated interest rate volatil¬ 
ity from the equations in table 4 were added to the real GNP equation 
(table 5). For both periods, the current anticipated volatility measure 
marginally outperforms the lagged unanticipated measure. Taking 
money growth volatility into account in the forecast of $\ln VR_t$
strengthens the result that it is anticipated volatility that matters.
When both $EVR_{t-1}$ and $\ln \widehat{VR}_t$ are entered into the output equation
(not shown), neither is significant in the early period. Over the longer
period, however, $\ln \widehat{VR}_t$ has a significant negative coefficient, while
$EVR_{t-1}$ is insignificant. 10 Thus neither the anticipated nor the unanticipated
interest rate volatility hypotheses can be rejected in the
early period, but the results reject the unanticipated volatility hy¬ 
pothesis in favor of an anticipated volatility hypothesis when data for 
1979-81 are included. 


Money Growth and Interest Rate Variability: 

Supply or Demand Shock? 

Evans’s analysis of the effect of risk on output and previous work on 
monetary variability by Gertler and Grinols (1982) and Mascaro and
Meltzer (1983) suggest that risk affects aggregate demand. In Gertler
and Grinols (1982), increased variability of money growth reduces
investment, while in Mascaro and Meltzer (1983), the aggregate de- 


9 Tests of whether $VM^*$ directly affected real output in either 1948-78 or 1948-81
indicate that it did not, as Evans found for unanticipated annual money growth variability.

10 The coefficient on $\ln \widehat{VR}_t$ is unaffected: it is -0.020 ($t = -2.18$), while that on
$EVR_{t-1}$ is -0.009 ($t = -0.90$).





TABLE 5

Constant              2.528      2.080      2.607      2.036
                     (4.10)     (3.06)     (4.56)     (3.20)
Trend                  .016       .011       .016       .011
                     (4.48)     (2.82)     (4.93)     (2.92)
ln X_{t-1}             .578       .669       .567       .676
                     (5.80)     (6.04)     (6.11)     (6.54)
DMR_t                  .915       .998       .919      1.027
                     (3.81)     (3.86)     (4.04)     (4.08)
DMR_{t-1}             1.045       .711      1.062       .681
                     (4.37)     (2.75)     (4.61)     (2.67)
ln VR_t               -.030                 -.027
                    (-3.76)               (-4.51)
EVR_{t-1}*                       -.024                 -.025
                               (-3.34)               (-3.78)
R^2                    .998       .998       .998       .998
S.E.                  .0163      .0170      .0157      .0168
D-W                    1.92       1.78       1.91       1.79

Note. t-statistics are given in parentheses.

* EVR_{t-1} is the lagged residual from the interest rate volatility equation.


mand reductions arise through an increase in money demand. If a 
rise in money growth volatility and/or interest rate volatility results in
a reduction in aggregate demand, however, one would expect to observe
a decline in prices (assuming the supply curve is not horizontal,
in which case no price response would be expected). Increases in
perceived risks of future returns, however, have effects on producers.
Indeed, reductions in investment represent reductions in demand for
capital inputs that in the short run are achieved by reduced utilization
of capital, given expected prices of inputs and outputs.11 Whether
supply is reduced more than demand at unchanged prices is essentially
an empirical question.

To test the price level effect, VR was entered into the Barro price
equation (1981, pp. 159-67) for 1952-78 and for 1952-81.12 The
price level equation is derived from a money demand function where
$r_{t-1}$, the lagged Aaa yield, is used as an instrument for the current


11 A reduction in desired capacity due to an increase in the variability of demand is a
result that has been obtained by Sandmo (1971) and Holthausen (1976) for risk-averse
firms. DeVany and Saving (1983) provide a model of the firm in which a rise in
variability of demand leads to a rise in price and a decline in expected output relative to
capacity. In their model capital is diverted, in a sense, to providing insurance against
variability so that output per unit of capital shrinks.

12 The price equation is estimated from 1952 to avoid reliance on the Friedman-Schwartz
(1963) data but to allow for the long lags on DMR.
quarter yield and $G_t$, federal government purchases, is included along
with current and five lagged values of unanticipated money growth
and a trend. The equation is also modified to include another supply
factor, the relative price of energy.13 Also, the variability measures are
used rather than their logarithm as above because in both periods this
specification generally yielded superior results. Using the log of the
variables does not alter the conclusions. No attempt has been made
here to test whether it is anticipated or unanticipated volatility that
matters.

The results are given in table 6. Over both periods, a rise in risk is
indicated to have a significant positive effect on the price level. While
a rise in volatility may depress aggregate demand, the dominant effect
is on aggregate supply, so that increased risk raises the price
level.14 The hypothesis that increased money growth volatility affects
prices is also examined in table 6. In both periods, $VM^*_t$ has positive
and significant effects on prices when the interest rate variability
measure is excluded. When both $VM^*_t$ and lagged interest rate variability
are included (not shown), $VM^*_t$ is not significant in the early period
(t = 1.96), but both variables are significant in the longer period.15 In
the 1952-78 period, money growth volatility, which changed little,
had no appreciable effect on prices once lagged interest rate volatility
is taken into account. Thus the positive impact of lagged interest rate
volatility is robust in the Barro model, but that for money growth
volatility is robust only if interest rate volatility is ignored.16


13 The energy price measure is the producer price index for fuel, power, and related
products deflated by the implicit price deflator for business-sector output. The price
effect of energy price changes arises from a reduction in productivity or potential
output, or natural output in Barro's model. See, e.g., Rasche and Tatom (1981) and the
references there. When the relative price of energy is included in the output equation
above, it is not significant (t < 1.5), although the steady-state effect ($\ln X_t = \ln X_{t-1}$) is
significantly negative. The inclusion of energy prices also has no effect on the real
output results reported above, so it was arbitrarily omitted.

14 Another extension of the test of the effects of volatility would be to add the risk
measure to Barro’s unemployment equation. Attempts to do this, however, failed as no
measure was statistically significant in 1947-78 or 1947-81. The absence of a significant
effect is not a serious problem, however, since the equation performs relatively
poorly anyway and is likely seriously misspecified. For example, it ignores any secular
rise in the natural rate of unemployment and shows permanent effects of federal
purchases on the unemployment rate.

15 In the 1952-81 period, $VR_{t-1}$ is marginally insignificant (t = 2.07; critical value =
2.11) when $VM^*_t$ is included; $VM^*_t$ remains significant (t = 2.95). Thus there is a
complete reversal where only $VR_{t-1}$ matters for 1952-78 but only $VM^*_t$ is significant
over the longer period.

16 The evidence suggests that both interest rate and money growth volatility reduce
the demand for real money balances. The latter appears counter to the Mascaro-Meltzer
(1983) hypothesis, which emphasized a positive money demand effect of risk
but ignored such supply-side considerations. Their hypothesis is not directly tested
here since the estimated equations are quasi-reduced-form expressions for prices,
where output is allowed to vary. The results suggest only that, net of other changes and
given interest rates, a rise in money growth variability reduces money demand.
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TABLE 6 

The Effect of Interest Rate Variability on Prices (1952-78; 1952-81) 


Independent
Variable                 1952-78                  1952-81

Constant           -.974     -1.103       -.912     -1.107
                 (-9.79)    (-9.56)     (-7.74)   (-10.81)
DMR_t              -.928      -.768       -.854      -.751
                 (-6.76)    (-5.20)     (-5.43)    (-5.32)
DMR_{t-1}          -.881      -.996       -.852      -.984
                 (-6.33)    (-6.31)     (-5.40)    (-6.74)
DMR_{t-2}          -.857     -1.080       -.815     -1.102
                 (-5.88)    (-6.94)     (-5.34)    (-7.86)
DMR_{t-3}         -1.354     -1.424      -1.199     -1.455
                 (-9.36)    (-8.23)     (-7.32)    (-9.02)
DMR_{t-4}         -1.275     -1.134      -1.264     -1.160
                 (-9.37)    (-7.45)     (-8.22)    (-8.10)
DMR_{t-5}          -.585      -.565       -.681      -.608
                 (-4.49)    (-3.72)     (-4.52)    (-4.29)
t                  -.008      -.011       -.008      -.012
                 (-6.67)   (-10.46)     (-6.55)   (-13.45)
Ln G_t              .051       .076        .038       .077
                  (2.39)     (3.08)      (1.50)     (3.47)
R_{t-1}             .846      2.389       1.206      2.656
                  (1.58)     (5.69)      (2.19)     (7.62)
Ln p^e_t            .051       .041        .044       .040
                  (4.95)     (3.82)      (5.09)     (4.96)
VR_{t-1}            .198                   .090
                  (3.45)                 (2.82)
VM*_t                          .005                   .005
                             (2.40)                 (3.65)
R^2                  .98        .98         .97        .98
S.E.               .0069      .0078       .0084      .0076
D-W                 2.02       2.03        1.85       2.01

Note.-t-statistics are given in parentheses. Dependent variable is (ln P - ln M).


Summary 

Evans’s postulated mechanism for his observed adverse effect of interest
rate variability on output is open to serious doubts. Whether
this effect occurs because of a current rational anticipation of variability
or because of a past unanticipated volatility effect is not clear. In
principle, an increase in risk should affect economic decisions only
when it is known or anticipated to occur. The results here do not
discriminate clearly between the anticipated/unanticipated hypotheses
but favor the expectations effect when such expectations allow
for money growth variability. The analysis also clarifies the mechanism
through which risk affects output. Instead of simply affecting
aggregate demand, it appears that risk has a dominant adverse effect
on aggregate supply and therefore is positively related to the price
level. Finally, the evidence presented suggests that policymakers have
reason for concern about interest rate volatility. Recent increases in
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the variability of money growth have been associated with greater 
interest rate variability. Policymakers may be able to contribute to 
higher levels of real GNP by reducing risk through steadier control of
monetary aggregate growth. 
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Official Intervention in the Foreign Exchange 
Market 

Peter D. Spencer 

Nuffield College, Oxford University 


In an article in this Journal, Dean Taylor (1982) proposes a simple 
statistical test of the significance of central bank losses in the exchange 
market: Would random intervention be likely to result in as large a 
loss? This comment shows that, as it stands, his test statistic does not 
allow properly for the effect of the mean level of intervention. There 
is in fact an element of ambiguity in the question Taylor poses, and 
this is illustrated with examples taken from his empirical work. The 
connection between the Taylor test and regression analysis is noted 
and explored. 


The Taylor Criterion 

Taylor calculates ex post profits by multiplying the purchase of 
foreign exchange made in each month by the percentage change in 
the average dollar exchange rate between that month and the end of 
the observation period: 


$$\text{\$ total} = \sum_{t=1}^{T} n_t g_t = n'g, \tag{1}$$

where $g_t = 1 - (e_t/e_T)$, $n_t$ = intervention flow ($) in month $t$, $n' = (n_1\ n_2 \ldots n_T)$,
$e_t$ = domestic currency cost of dollars, $g_t$ = percentage
capital gain from period $t$ to period $T$, $g' = (g_1\ g_2 \ldots g_T)$, and $T$ = total
number of periods. He develops a test of their statistical significance
by asking whether random intervention with zero mean and the same
variance as actual intervention would be likely to generate such losses.


This paper was written while I was a Research Fellow at Nuffield College, Oxford
University. I am grateful to David Hendry, Charlie Bean, Andrew Rose, and a referee
for helpful comments and to Dean Taylor for help with the data.
He notes that, if $n \sim N(0, \sigma^2 I)$, the variance of profit may then be
represented by

$$\operatorname{var}(n'g) = \sigma^2 g'g. \tag{2}$$

He uses the variance of actual intervention as an estimate of $\sigma^2$:

$$\hat\sigma^2 = \frac{(n - \bar n i)'(n - \bar n i)}{T - 1} = \frac{n'(I - ii'/T)\,n}{T - 1}, \tag{3}$$

where $I$ represents a $T \times T$ identity matrix, $i$ is a $T \times 1$ summation
vector, with elements equal to unity, and $\bar n$ is the mean value of $n$.
Dividing the actual value of profit (1) by its estimated standard deviation
gives the Taylor test statistic:

$$t_0 = \frac{n'g}{\sqrt{(g'g)\,\hat\sigma^2}}. \tag{4}$$



Taylor asserts that this statistic has a Student t-distribution and proceeds
to conduct a series of one-tailed t-tests. For this to be the case it
is necessary that the numerator and denominator terms in (4) be
independent, which in this case requires (Kendall and Stuart 1969, p.
359)

$$(I - ii'/T)\,g = 0. \tag{5}$$



This holds only if the elements of g are zero or constant. Since this is
not the case empirically, the Taylor statistic does not have an exact
t-distribution.
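
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the intervention flows and exchange rates are hypothetical numbers, the function names are invented for the example, and the divisor T - 1 in the variance about the mean is assumed rather than taken from Taylor's paper.

```python
import math

def capital_gains(e):
    """g_t = 1 - e_t / e_T for each month t, where e_T is the final rate."""
    e_T = e[-1]
    return [1.0 - e_t / e_T for e_t in e]

def taylor_statistic(n, e):
    """Return (profit, t0) for intervention flows n and exchange rates e."""
    T = len(n)
    g = capital_gains(e)
    profit = sum(n_t * g_t for n_t, g_t in zip(n, g))   # n'g, equation (1)
    n_bar = sum(n) / T
    # Variance of intervention about its mean; the T - 1 divisor is an assumption.
    sigma2_hat = sum((n_t - n_bar) ** 2 for n_t in n) / (T - 1)
    gg = sum(g_t * g_t for g_t in g)                    # g'g
    t0 = profit / math.sqrt(gg * sigma2_hat)            # equation (4)
    return profit, t0

# Hypothetical data: six months of dollar purchases and exchange rates.
n = [100.0, -50.0, 80.0, 120.0, -30.0, 60.0]
e = [2.00, 2.10, 2.05, 2.20, 2.15, 2.30]
profit, t0 = taylor_statistic(n, e)
print(round(profit, 4), round(t0, 4))
```

Note that the numerator uses the raw flows while the denominator uses deviations from the mean, which is exactly the asymmetry criticized in the text.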

This problem arises because the intervention series used to calculate
the numerator (profit) will not in general net out to zero, whereas
the denominator (estimated variance) is constructed under the assumption
that it does. The consequence of this misspecification thus
depends on the magnitude of the mean intervention flow. If this is
large (as, e.g., in the cases of Germany and Spain during the 1970s),
this statistic will give a misleading indication, overestimating the
statistical significance of profit or loss. If it is small (as in the case of the
United Kingdom), the bias is likely to be negligible.


Alternative Criteria 

The Taylor statistic can readily be modified to avoid this problem by
treating the numerator and denominator terms symmetrically in order
to obtain a statistic with an exact t-distribution. The first option is
to estimate $\sigma^2$ in (2) by the absolute variance of intervention, as against
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the variance about the mean. This criterion offers a check on the 
overall intervention loss, implicitly using the terminal value of the 
exchange rate to value the overall accumulation of reserves during 
the estimation period. It tells us whether random intervention, with 
the same mean and variance as actual intervention, would have been 
likely to have resulted in losses as great as those actually incurred. The 
second option is to work throughout with deviations of intervention 
flows about the mean. This is equivalent to valuing the overall ac¬ 
cumulation of reserves at the average exchange rate prevailing over 
the period. 

This latter statistic tells us whether it is likely that the difference 
between the loss due to intervention and that implied by a policy of 
constant intervention (with the same mean) could have been 
generated by a policy of random intervention (with the same vari¬ 
ance). This statistic is interesting because it allows us to distinguish 
between the cost of the overall level of intervention during any period 
(which may be dictated by monetary, balance of payments, or other 
considerations) and month-to-month variations in its level. The first
of these statistics may be represented as 


$$t_1 = \frac{n'g}{[g'g\,(n'n/T)]^{1/2}}. \tag{6}$$



The properties of this criterion can be seen immediately by noting its
similarity to the conventional t-statistic that would be obtained by
regressing n on g (or vice versa) and suppressing the intercept
coefficient. If we denote the coefficient in this regression by $\beta$, its
estimate by $b$, the residual error variance by $s^2$, and the t-statistic by $t_1'$,
we have

$$n = \beta g + v, \qquad b = \frac{n'g}{g'g}, \qquad \operatorname{var}(b) = \frac{s^2}{g'g},$$

and

$$t_1' = \frac{n'g}{[g'g\,s^2]^{1/2}}, \tag{7}$$

where

$$s^2 = \frac{n'\left(I - \dfrac{gg'}{g'g}\right)n}{T - 1}.$$



The only difference between $t_1$ and $t_1'$ is that the variance is estimated under the
null hypothesis ($n'g = 0$ or $\beta = 0$) in the former case but not in the latter. Consequently, we
recognize $t_1$ as the Lagrange multiplier (LM, or efficient score) statistic
associated with this hypothesis and $t_1'$ as the Wald statistic (Berndt and
Savin 1977). Assuming that

$$v \sim N(0, \sigma^2 I), \tag{8}$$

both statistics have a t-distribution with $T$ and $T - 1$ degrees of freedom,
respectively.1 These statistics are asymptotically equivalent, so there is no
need to choose between them in large samples.

Working with deviations of the intervention flow from its mean
(represented by the vector $\tilde n$) gives the LM variant of the second
statistic,
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which has the regression or Wald analogue 
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( 10 ) 


where 


■2 = 


n'(l- 


KK 

YJ 


n 


T 


These have $T - 1$ and $T - 2$ degrees of freedom, respectively.
Obviously $t_2$ ($t_2'$) will be close to $t_1$ ($t_1'$) either if “closure” occurs and the
mean level of intervention is negligible or if the terminal value of the
exchange rate equals its average value. Differences between these
statistics indicate the sensitivity of the calculations to changes in the
average level of intervention and the exchange rate used to close the
books. These tests are symmetric in that the same t-statistic emerges
no matter whether n or g is considered to be the dependent variable.


The Regression Framework 

Once the problem is recast within a regression framework several 
advantages are apparent. The statistics described in the previous sec¬ 
tion can be computed readily using standard packages, and their 
distributional characteristics are relatively well understood. Moreover,
regression programs routinely provide important misspecification
tests: for example, the Durbin-Watson statistic provides a robust


1 The p statistic has an exact t-distribution only under the null.
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check on the independence assumption incorporated in equation (8). 2 
Structural stability can also be tested conveniently when the problem
is set up in this way. Finally, the regression framework reveals the
strong similarity between tests of the profitability of intervention and
more general tests of market efficiency in which ex post exchange
gains (typically adjusted for interest rate differentials) are regressed
on variables representing the market’s information set.


Some Empirical Examples 

In order to illustrate some of the differences between these statistics, 
table 1 shows some typical results, obtained using the same data and 
estimation periods as in the Taylor study. 

These figures illustrate the well-known result (Berndt and Savin
1977) that the Wald statistic ($t_1'$, $t_2'$) provides a more stringent test than
the LM analogue ($t_1$, $t_2$). The results for Spain are most interesting in
this respect. Those for the United Kingdom show that the numerical
differences between the $t_1$ and $t_2$ type measures can be quite noticeable
even when the mean level of intervention is small. In the German
case, these measures give a different qualitative indication, the $t_1$ type
statistic indicating a loss and the $t_2$ type a gain. In other words, a
private speculator who had countered the intervention policy would
have made a profit over the period, but this would have been less than
he would have made had he ignored month-to-month fluctuations
and bought the deutsche mark at a steady rate. In this sense it would
not have been profitable to bet against the central bank.3


Conclusion 

Dean Taylor has suggested a simple criterion for judging the 
profitability or otherwise of foreign exchange intervention by central 
banks: Would random intervention result in losses as large as those 
that actually occurred? However, this question is ambiguous unless 
closure occurs since there are two alternative ways of taking into 


2 Somewhat surprisingly, very few of my results indicated dynamic misspecification.
Were this to prove a problem an obvious modification would be to estimate the variance
of intervention using the residuals from a regression of the $n_t$ on their own lagged
values together with the $g_t$.

3 In this particular case neither $t_1$ nor $t_2$ is significant. However, it is easy to find cases
in which these statistics are qualitatively different and one or both are significant. For
example, in the case of the United Kingdom (1972:7-1976:12), $t_1 = -3.68$, but $t_2 = 1.96$.
When assessing the profitability of intervention it is important to take account of
net interest receipts since these naturally tend to offset exchange losses. In a separate
paper I show that in the case of the United Kingdom (1972:7-1979:12) net interest
receipts offset exchange losses almost exactly.
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TABLE 1 

Alternative Statistical Indicators of the Significance of Exchange Losses

                             Spain            United Kingdom       Germany

Estimation period        1974:2-1979:12      1972:7-1979:12     1973:6-1979:12
Sample size                    69                  90                 79
Mean intervention
  ($ millions)                80.7               -13.7              212.3
t_1                         -3.600              -1.827              -.763
t_1'                        -3.954              -1.873              -.763
t_2                         -2.472              -1.936               .913
t_2'                        -3.789              -1.978               .913

Note.-A negative t-value indicates exchange loss.



account the mean level of intervention. Accordingly, Taylor's question
can be asked in two different ways: (a) Was the overall loss due to
intervention greater than would have been expected by a program of
random intervention (with the same mean and variance), and (b) was
this loss significantly larger than would have been the case had the
overall amount of intervention occurred steadily over the period?
These two approaches can give surprisingly different answers even
when the mean level of intervention is relatively small. Moreover, it is
necessary that both questions be answered in the affirmative for it to
be profitable for private speculators to bet against the central bank. If,
for example, the answer to question a is yes but the answer to b is no,
then he could do as well (or better) by building up his position at a
steady rate, ignoring month-to-month fluctuations in the level of
official intervention.
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A Critical Appraisal of Hausman's Welfare 
Cost Estimates 

Edgar K. Browning 

Texas A&M University 


Economists have long been aware that a tax on labor income produces
a welfare cost by distorting the income-leisure choices of workers.
Arnold Harberger was the first economist to attempt to actually estimate
the size of this welfare cost. In a seminal paper (Harberger
1964) he estimated the welfare cost of the federal individual income
tax at about 2.5 percent of tax revenue in 1961. In 1976, I corrected a
minor error in Harberger’s analysis and, using a slightly higher assumed
labor supply elasticity, estimated the welfare cost of all taxes on
labor income at about 5 percent of tax revenue (Browning 1976).

The magnitudes of these early estimates stand in sharp contrast to
those reported in a series of recent, widely cited papers by Jerry
Hausman (1981a, 1981b, 1983a, 1983b). Hausman estimates the welfare
cost of labor supply distortions for husbands to be 29 percent of
tax revenue. If Hausman’s finding were correct, it would represent a
major contribution to our knowledge and would also necessitate a
reevaluation of the role of income taxes in the tax system. In this
comment I discuss several reasons for being skeptical about Hausman’s
results. In contrast to published commentary on his work,
which has focused on econometric issues (and has tended to be highly
complimentary),1 I will emphasize theoretical issues and the plausibility
of the underlying labor supply responses implied by his welfare
cost estimates.


I would like to thank William R. Johnson, Jonathan Skinner, and an anonymous 
referee for comments on an earlier draft. 

1 Of four published comments on Hausman’s work (Boskin 1981; Burtless 1981;
Perloff 1981; Heckman 1983), only Heckman’s is particularly critical. In view of the
implications of Hausman's econometric work discussed in this paper, I think Heckman
had good reason to be critical of the econometrics underlying Hausman's welfare cost
estimates.
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I. Theory 

In calculating welfare costs for various taxpayers, Hausman attempts 
to estimate the monetary compensation required to offset the utility 
difference between the present tax system and a lump-sum tax of 
equal yield. However, he does this for only two taxes: the federal 
individual income tax and the employee portion of the social security 
payroll tax. The procedure Hausman uses would be theoretically cor¬ 
rect if these were the only two taxes falling on labor income. But they 
are not: the employer portion of the social security payroll tax, state 
and local income taxes, and sales and excise taxes also distort the 
income-leisure choice. 2 

As I have pointed out elsewhere (1976, p. 290), when several taxes
interact to distort labor supply decisions, it is impossible to identify
part of the total welfare cost as “the” welfare cost attributable to a
subset of these taxes, although Hausman purports to do just this.
Figure 1 can be used to illustrate this and several later points. An
individual’s compensated labor supply curve is shown as S*. Let there
be one or more indirect taxes that act to reduce the wage received by
the individual at a combined marginal tax rate of $m_i$; thus, the wage
net of these taxes is $w_1$. In addition, let there be one or more direct
taxes levied on wage income after it is received at a combined marginal
rate of $m_d$. (Note that the direct taxes are levied on $w_1$, not $w$.) As
a result the individual confronts a net marginal wage rate of $w_2$ and is
in equilibrium at point A with a labor supply of $L_3$.

The total welfare cost of these taxes is clearly area ABC, but what is
“the” welfare cost of one of these taxes alone? In fact, it is not possible
to attribute part of area ABC to any particular tax or subset of taxes.
For example, we might evaluate the welfare effects of substituting a
lump-sum tax for the direct taxes (this is what Hausman does), which
increases the net marginal wage rate from $w_2$ to $w_1$. The welfare effect
of this change is area BDEA, suggesting that area DCE is “the” welfare
cost of the indirect taxes. This, however, is clearly an arbitrary way to
proceed. We could just as well first substitute a lump-sum tax for a
different subset of taxes (such as the indirect taxes) so that the wage
rate rises from $w_2$ to $w_1$. This would then suggest that area BDEA is
“the” welfare cost of the indirect taxes and area DCE is “the” welfare
cost of the direct taxes.3
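
The arbitrariness of the partition can be seen in a small numerical sketch. It assumes a linear compensated supply curve with unit slope, a gross wage normalized to 1, and equal hypothetical rates of 0.2 for the indirect and direct taxes (equal rates mirror the way figure 1 is drawn); none of these numbers comes from Hausman's data.

```python
def welfare_cost(wedge, slope=1.0):
    """Welfare triangle for a linear compensated supply curve.

    wedge: gap between the gross wage and the net marginal wage.
    slope: change in labor supplied per unit change in the net wage.
    """
    return 0.5 * slope * wedge ** 2

w = 1.0                 # gross wage (normalized, hypothetical)
m_i = m_d = 0.2         # hypothetical indirect and direct marginal rates
w1 = w * (1 - m_i)      # wage net of indirect taxes
w2 = w1 * (1 - m_d)     # wage net of both sets of taxes

total = welfare_cost(w - w2)                      # area ABC

# Ordering 1: replace the direct taxes first, so the net wage rises to w1.
direct_first = total - welfare_cost(w - w1)       # area BDEA
indirect_rest = welfare_cost(w - w1)              # area DCE

# Ordering 2: replace the indirect taxes first; only the direct wedge remains.
w_direct_only = w * (1 - m_d)
indirect_first = total - welfare_cost(w - w_direct_only)
direct_rest = welfare_cost(w - w_direct_only)

print(round(direct_first, 4), round(indirect_rest, 4))
print(round(indirect_first, 4), round(direct_rest, 4))
```

Whichever set of taxes is replaced first is assigned the larger area, even though the two sets are identical in this example, while the total is the same under either ordering.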

2 It is important to recognize that sales and excise taxes affect labor supply decisions
regardless of whether they are shifted backward or forward. If they depress factor
prices (including wage rates) without raising product prices, the effect is obvious. But
even if they raise product prices while leaving nominal factor prices unchanged, the
result is still a reduction in the real wage rate.

3 To make this point as clear as possible, fig. 1 is drawn with $m_i = m_d$. This, however,
is not crucial to the result. The basic point derives from the fact that any specified
Of course, proceeding in either way is arbitrary. While we can
unambiguously identify area ABC as the total welfare cost of all taxes
on labor income, we cannot partition the area into parts that are
uniquely associated with particular taxes. Nonetheless, we can evaluate
the marginal welfare effects of a specified change in the tax system,
such as the substitution of a lump-sum tax for the direct taxes.
On this interpretation, Hausman is actually estimating the marginal
welfare cost of a specific reduction in the effective marginal tax rate
on labor income. While the set of taxes investigated in this way is
arbitrary, the results are still of interest. This procedure does, however,
produce estimated welfare costs that are larger relative to the
tax revenue of the specified taxes than is the total welfare cost relative
to revenue of all taxes. This is one reason why Hausman's results,
when expressed as a fraction of tax revenue from the specified taxes,
can be expected to be large compared with those of other investigators
who have focused on the total welfare cost.

Thus, I interpret Hausman’s research as an attempt to estimate the 
change in the welfare cost produced by substituting a lump-sum tax 
for the individual income tax and employee social security tax. Haus¬ 
man’s method of estimating this change in welfare cost is, however, 
incorrect. In terms of figure 1, his estimate is area FEA. (Hausman
himself is not very clear on how he proceeded, but a careful reading


reduction in the effective marginal tax rate can be accomplished by combining rate
reductions of different taxes in numerous ways.
other words, if a person paid $1,000 in personal income and
employee social security taxes, Hausman simulated the results of substituting
a lump-sum tax of $1,000 for these taxes. In terms of figure
1, this implies an increase in labor supply from $L_3$ to $L_2$, and the
individual is better off by area FEA. The increase in labor supply,
however, also generates additional revenues from the indirect taxes,
as shown by area BDEF. These are as much a social gain as the gain
that accrues to the individual himself.5 Consequently, the change in
the welfare cost from adding the direct taxes to the indirect taxes
(conversely, the welfare gain from substituting a lump-sum tax for the
direct taxes) is given by area BDEA.

Hausman makes an elementary theoretical error by estimating the
welfare cost as area FEA rather than as area BDEA. Given the labor
supply responses implied by his empirical work, the relevant welfare
cost of the assumed tax change is substantially larger than he estimates.
Note that sales-excise and state income taxes were 5 and 1.7
percent of net national product, respectively, in 1975 (the year Hausman's
data refer to), and the employer portion of the social security
payroll tax was 5.85 percent. For workers under the social security
ceiling on taxable earnings, this suggests that the marginal tax rate of
taxes ignored by Hausman ($m_i$ in fig. 1) was about 12 percent, while
the marginal rate for workers above the social security ceiling (recognizing
the progressivity of state income taxes) was about 8 percent.

To illustrate the magnitude of the error produced by this oversight,
let $m_i = 0.1$ and $m_d = 0.3$. Under these fairly typical conditions, if the
compensated supply curve is linear, area BDEF is 74 percent as large
as area FEA, suggesting that the true welfare cost of the specified tax
change would be on the order of 74 percent higher than the estimates
Hausman reports. Since Hausman’s estimate is that the welfare cost is
29 percent of tax revenue, the figure corrected for this error would be
about 50 percent.
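
The 74 percent figure can be reproduced with a short calculation. The sketch below assumes a linear compensated supply curve and normalizes the gross wage to 1; area FEA is then a triangle of height $m_d w_1$ and area BDEF a rectangle of height $m_i w$, both with base equal to the change in labor supply, so the ratio does not depend on that change. The function name is invented for the illustration.

```python
def bdef_over_fea(m_i, m_d):
    """Ratio of area BDEF to area FEA for a linear compensated supply curve.

    Both areas share the same base (the change in labor supply), so only
    their heights matter. The gross wage w is normalized to 1.
    """
    w1 = 1.0 - m_i                # wage net of the indirect taxes
    fea = 0.5 * m_d * w1          # triangle: one-half (w1 - w2), per unit of base
    bdef = m_i                    # rectangle: (w - w1), per unit of base
    return bdef / fea

ratio = bdef_over_fea(0.1, 0.3)
corrected_share = 0.29 * (1.0 + ratio)   # Hausman's 29 percent scaled up
print(round(ratio, 4), round(corrected_share, 4))
```

With $m_i = 0.1$ and $m_d = 0.3$ the ratio is 0.1/0.135, or about 0.74, and scaling the 29 percent estimate by 1.74 gives roughly 50 percent, as stated in the text.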

This correction of Hausman’s error therefore makes his results all
the more impressive. I am convinced, however, that his approach
greatly overstates the true welfare cost for reasons explained below.


4 "The deadweight loss calculations replace only federal taxes with lump-sum taxes
and leave state income taxes in place” (Hausman 1981a, p. 53). It is clear from this and
other internal evidence that Hausman’s estimates are made as I describe them here.

5 Putting this point differently, a lump-sum tax that would keep total government
revenue unchanged would be smaller than $1,000 by area BDEF.
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II. Applying the Theory 

Two factors are of overriding importance in determining the size of
the welfare cost due to labor supply effects of a tax on labor income.
One is the effective marginal tax rate, which can usually be estimated
with a reasonable margin of error.6 The second is the compensated
labor supply response when a lump-sum tax is substituted for the tax
in question, the change from $L_3$ to $L_2$ in figure 1. This second factor
has been the subject of several empirical studies, but Hausman uses
the results of his own econometric work in developing his estimates.

In the papers I have read, Hausman at no point informs the reader
of what compensated labor supply responses are implied by his welfare
cost estimates. Hausman does compare labor supply under the
present income and payroll taxes with no taxes at all, with a proportional
tax, and with reduced tax rates. For instance, he reports that
married prime-age males work 8 percent less under the present tax
system than they would with no taxes at all. All of these comparisons,
however, involve changes where there are large income and substitution
effects that tend to offset one another to a substantial degree.7
What matters for the welfare cost estimates, however, is the compensated
labor supply effects reflecting the substitution effects alone.

I propose here to infer the compensated labor supply responses
that are implied by three examples Hausman describes in sufficient
detail to make such an inference possible. Although this turns out to
be exceedingly simple to do, the results are edifying. My contention is
that the labor supply responses are, in at least two of the three cases,
so large that they can immediately be seen to far exceed what is
plausible.

My analysis is based on the fact that if we assume that the compen¬ 
sated labor supply curve is linear over the relevant range, there is a 
very simple relationship between Hausman’s (incorrect) welfare cost 
estimates and the compensated changes in labor supply that underlie 
these estimates. In terms of figure 1, Hausman’s estimates (area FEA) 


6 It can be argued, however, that Hausman's use of nominal statutory tax rates 
overstates the true marginal tax rates (Heckman 1983). If an additional $100 in earnings 
results in an additional $20 in deductions and/or exclusions, the statutory rate 
applies only to $80, and the rate relative to the $100 real increment in earnings would 
be 25 percent less than the statutory rate. 

7 Actually, the aforementioned comparisons are highly suspect on theoretical 
grounds. For instance, in Hausman's analysis the elimination of the income and payroll 
taxes has a large income effect depressing labor supply. Assuming such a comparison is 
meaningful, it can certainly be objected that in an appropriate analysis there can be no 
income effect since government expenditures must fall by a magnitude similar to the 
tax reduction. 



JOURNAL OF POLITICAL ECONOMY 

of the welfare cost, W, can be expressed as 

$$ W = \tfrac{1}{2}\, m_f\, w_1\, \Delta L, \qquad (1) $$

or the welfare cost equals one-half m_f times the change in earnings 
(evaluated using w_1) that results when a lump-sum tax replaces the 
income and payroll taxes and is of a size that keeps the worker on his 
initial indifference curve. If the compensated labor supply curve is 
not linear, this is only an approximation. However, applying this pro¬ 
cedure to the numerical example worked out by Burtless (1981) in his 
comment, I estimate a compensated change in earnings that differs by 
only 2.5 percent from that reported by Burtless using Hausman's 
technique.8 Thus, I do not think that equation (1) will misrepresent 
the behavioral responses underlying Hausman's results. 

Table 1 gives the necessary data for three representative individual 
taxpayers described by Hausman (1981a). Cases A and B are prime-age 
married males, one with a wage equal to the mean in his sample 
($6.18) and the other with a wage approximately equal to the mean in 
the top quintile of male earners ($10.00). Case C is a married female 
with a wage slightly below the mean wage of working wives and whose 
husband earns $10,000 (all figures for 1975). Hausman states that his 
full-sample results resemble these three illustrations, so it is unlikely 
that we will get a misleading impression by concentrating on these 
examples. 

Applying equation (1), the data in table 1 imply that earnings (and 
presumably labor supply) will increase by 15.4, 68.4, and 89.2 percent, 
respectively, for the compensated change underlying the welfare 
cost estimates. Since all three workers are full-time workers, it 
may be worthwhile to translate these results into what they imply if 
each worker is initially working a 40-hour week. These results then 
indicate that the workweek will become 46.2, 67.4, and 75.7 hours. 
The implied compensated labor supply elasticities are 0.4, 1.76, and 
1.74, all evaluated at the net wage rate (point A in fig. 1). 
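
The arithmetic behind these figures can be retraced with a short sketch. It assumes only equation (1) and the table 1 data; the function and variable names are illustrative, and the elasticity is taken as the proportional change in earnings divided by the proportional rise in the net wage, m/(1 - m), when the marginal tax is removed.

```python
# Table 1 inputs as reported in the text (Hausman 1981a; 1975 dollars).
# "m" is the marginal tax rate expressed as a fraction.
CASES = {
    "A": {"earnings": 10976.0, "welfare_cost": 235.0, "m": 0.2785},
    "B": {"earnings": 19662.0, "welfare_cost": 1883.0, "m": 0.28},
    "C": {"earnings": 8000.0, "welfare_cost": 1208.0, "m": 0.3385},
}


def implied_response(earnings, welfare_cost, m, base_hours=40.0):
    """Invert eq. (1), W = 0.5 * m * (change in earnings), for one taxpayer."""
    change_in_earnings = 2.0 * welfare_cost / m
    pct_change = change_in_earnings / earnings
    # Assumption from the text: the worker starts from a 40-hour week.
    new_hours = base_hours * (1.0 + pct_change)
    # Proportional rise in the net wage when the marginal tax is removed.
    net_wage_rise = m / (1.0 - m)
    elasticity = pct_change / net_wage_rise
    return pct_change, new_hours, elasticity


for name, case in CASES.items():
    pct, hours, eps = implied_response(**case)
    print(f"Case {name}: earnings +{100 * pct:.1f}%, "
          f"{hours:.1f} hours/week, elasticity {eps:.2f}")
```

Under these assumptions the sketch reproduces, to rounding, the percentage increases, workweeks, and elasticities quoted above.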

I believe that introspection and common sense alone are enough to 
allow us to conclude that these behavioral responses are larger than is 
plausible, certainly for cases B and C if not for case A. A helpful 
mental experiment is possible when we recognize that the stimulus 
that gives rise to these labor supply responses is almost identical to 
overtime pay. For example, substituting a lump-sum tax for the income 
and payroll tax in case B changes the relevant portion of the 
budget constraint in the same way as offering overtime pay at a rate of 

8 The procedure described by eq. (1) must be modified along the lines discussed in n. 
9 below to apply it to Burtless's example. When this is done, my estimated increase in 
earnings is actually 2.5 percent less than Burtless's, so I do not think the procedure 
I suggest is likely to overstate the labor supply response implicit in Hausman's results. 
Table 1
Welfare Cost Estimates (Selected Households)

                             Tax       Welfare    Marginal Tax
        w_1     Earnings   Revenue      Cost       Rate (m_f)
Case    ($)       ($)        ($)         ($)          (%)

A       6.18    10,976      1,078         235        27.85
B      10.00    19,662      3,474       1,883        28.0
C       4.00     8,000      2,080       1,208        33.85

Source: Hausman (1981a), table 3 and p. 37, except for the marginal tax rates, which I have calculated from 
information in Hausman's article. 


39 percent (0.28/0.72) above the worker's initial net marginal wage 
rate. We can then think about whether a typical high-wage male 
worker would respond to an overtime offer that is less than time and a 
half by increasing work effort from 40 to 67 hours a week.9 

We can take this analysis one step further. As pointed out in the 
previous section, Hausman does not estimate the total welfare cost of 
all taxes on labor income, but only the part that he attributes to the 
federal income and employee portion of the social security payroll 
tax. Suppose we consider the labor supply response that would be 
produced if a lump-sum tax is substituted for all taxes on labor in¬ 
come, the theoretically more appropriate framework that most ana¬ 
lysts have used. Assuming that the compensated labor supply curve is 
linear over this range, we can use the compensated labor supply elas¬ 
ticities implied by Hausman’s results to estimate the effects. Taking m, 
to be 0.12 for cases A and C and 0.08 for case B (who is above the 
social security ceiling on taxable earnings), the results based on an 
initial 40-hour work week are 49.2, 75.9, and 90.0 hours, respectively, 
for A, B, and C. 

As suggested above, such labor responses do not appear credible. 
Can we believe that a typical high-wage male will, when his marginal 
tax rate is reduced to zero, adopt a work schedule of 7 days a week, 11 
hours a day? Or that a typical average-wage married female will adopt 
a work schedule of 7 days a week, 13 hours a day? Yet these behav¬ 
ioral responses are crucial to Hausman’s overall results: more than 
half of his estimated welfare cost comes from high-wage workers, 
among whom these two examples are supposedly typical. 


9 The analogy to overtime pay is not exact because the compensated supply curve in 
fig. 1 is intended to keep the worker on his initial indifference curve, while overtime pay 
permits attainment of a higher indifference curve. The only difference in labor supply 
in these two cases is, however, due to the income effect of the welfare cost itself. Using 
Hausman's estimated mean coefficients on his virtual income variable, it turns out that 
the difference in labor supply in these two situations for cases A and C is small. 
Hausman does not give the coefficient that is relevant for case B. 
It should be mentioned that in these three examples 
tax revenue refers to revenue from the federal income tax alone. Hausman does not 
say so, but it is clear from the income tax tables reproduced in 
his paper (1981a). However, the welfare cost estimates are based on 
the substitution of a lump-sum tax for both the income tax and employee 
portion of the social security payroll tax. In reporting the 
welfare cost as a percentage of tax revenue, it would seem more 
appropriate to use the tax revenue from both these taxes. In these 
three examples, Hausman does not do this, but reports, for example, 
that the welfare cost for case B is 54.2 percent of tax revenue. If he 
follows this same practice when he gives his full-sample results—that 
the welfare cost is 29 percent of tax revenue—he would seem to be 
exaggerating an already overlarge estimate.10 (The correct figure 
would be about 21 percent.) It should be emphasized that whether tax 
revenues are correctly measured is irrelevant to the sizes of the labor 
supply responses. 

A final point is also of importance. In Hausman’s work, as well as in 
the analysis here, it has implicitly been assumed that the marginal 
social product of additional hours of work is constant and equal to the 
market wage rate. While this assumption may be a plausible 
simplification for small changes in hours of work, it is untenable for 
changes of the magnitude Hausman's results depend on. There are 
two reasons to expect the marginal social product of labor to decline 
as work effort increases. First, boredom and fatigue are certain to 
reduce the quality of work effort as hours exceed 60 or 70 per week. 
Second, an increase in the labor-capital ratio results from an economywide 
increase in labor supply and implies a reduced marginal 
product even if the quality of effort does not change. These two 
factors together imply that it is important to use a downward-sloping 
demand curve for hours of work. 

Figure 2 can be used to illustrate the importance of this point. The 
worker initially confronts a 35 percent marginal tax rate and is in 
equilibrium at point A. Suppose that the marginal social product 
curve is unit elastic throughout; this is shown as d. Curve S_1 is a linear 
compensated supply curve with an elasticity of 0.2 at point C, a plausible 
value for a married prime-age male, in my view. If the welfare 
cost is calculated under the assumption that w is constant, the result is 


10 Hausman states that the welfare cost for husbands is 28.7 percent of tax revenue in 
three of his papers (1981a, p. 61; 1981b, p. 198; 1983a, p. 66). In the most recent 
paper, however, the ratio is given as 22.1 percent (1983b, p. 51) without any explanation 
for the change. This is particularly puzzling since the table that contains the 22.1 
percent overall figure also gives the same welfare cost to tax revenue ratios for each of 
the five quintiles as those given in the earlier papers. 
area ABC, but the true welfare cost is area ADC. For the elasticities 
assumed, area ADC is 76 percent as large as area ABC. Thus, the true 
value is 24 percent less than the conventional estimate would imply. 
Of course, this problem is well known (Browning 1976, p. 286, n. 4), 
but most economists have not emphasized it since the overestimate is 
moderate as long as the supply elasticity is small relative to the de¬ 
mand elasticity—the situation thought to prevail. 

The situation is quite different if the compensated supply curve is 
highly elastic, as in Hausman's work. Curve S_2 has an elasticity of 1.76 
at point C, the implicit elasticity for the high-wage male worker in 
Hausman's examples. The conventional welfare cost estimate is now 
area ARC, and the true value is area ATC. In this case, area ATC is 29 
percent of area ARC. Thus, the conventional procedure, which is 
used by Hausman, overestimates the true welfare cost in this case by 
245 percent. 
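
The step from "29 percent of area ARC" to "overestimates by 245 percent" is a ratio conversion, sketched below. The function names are illustrative; the only inputs are the two area ratios stated in the text (0.76 for the inelastic case, 0.29 for the elastic case). The areas themselves are not recomputed, since that would require the geometry of figure 2.

```python
def overstatement_pct(true_share):
    """Percent by which the conventional estimate exceeds the true welfare cost.

    true_share is the true cost as a fraction of the conventional estimate,
    e.g. 0.29 when area ATC is 29 percent of area ARC.
    """
    return (1.0 / true_share - 1.0) * 100.0


def shortfall_pct(true_share):
    """Percent by which the true cost falls short of the conventional estimate."""
    return (1.0 - true_share) * 100.0


for label, share in [("S_1 (elasticity 0.2)", 0.76), ("S_2 (elasticity 1.76)", 0.29)]:
    print(f"{label}: true cost is {shortfall_pct(share):.0f}% below the conventional "
          f"estimate; the conventional estimate is {overstatement_pct(share):.0f}% too high")
```

The two measures diverge sharply as the true share falls: a true cost 24 percent below the conventional figure corresponds to an overstatement of roughly one-third, whereas a true cost 71 percent below it corresponds to an overstatement of about 245 percent.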

Therefore, I suggest that there are two fundamental flaws in Hausman's 
welfare cost estimates (in addition to the point discussed in the 
previous section). The first is that the implied labor supply responses 
are too large to be believable without some very compelling evidence. 
Second, even if we accept these behavioral implications, it is further 
necessary to assume that the marginal social product of labor does not 
decline even when the workweek exceeds 70 or 80 hours a week. 
Hausman’s results depend on both these premises, and neither is 
intuitively plausible. 




References 

Boskin, Michael J. "Comments." In How Taxes Affect Economic Behavior, edited 
by Henry J. Aaron and Joseph A. Pechman. Washington: Brookings Inst., 
1981. 

Browning, Edgar K. "The Marginal Cost of Public Funds." J.P.E. 84 (April 
1976): 283-98. 

Burtless, Gary. "Comments." In How Taxes Affect Economic Behavior, edited by 
Henry J. Aaron and Joseph A. Pechman. Washington: Brookings Inst., 
1981. 

Harberger, Arnold C. "Taxation, Resource Allocation, and Welfare." In The 
Role of Direct and Indirect Taxes in the Federal Revenue System. Princeton, N.J.: 
Princeton Univ. Press (for N.B.E.R. and Brookings Inst.), 1964. 

Hausman, Jerry A. "Labor Supply." In How Taxes Affect Economic Behavior, 
edited by Henry J. Aaron and Joseph A. Pechman. Washington: Brookings 
Inst., 1981. (a) 

---. "Income and Payroll Tax Policy and Labor Supply." In The Supply-Side 
Effects of Economic Policy, edited by Laurence H. Meyer. Proceedings of 
the 1980 Economic Policy Conference. St. Louis: Center Study American 
Bus., Washington Univ., 1981. (b) 

---. "Stochastic Problems in the Simulation of Labor Supply." In Behavioral 
Simulation Methods in Tax Policy Analysis, edited by Martin Feldstein. 
Chicago: Univ. Chicago Press (for N.B.E.R.), 1983. (a) 

---. "Taxes and Labor Supply." Working Paper no. 1102. Cambridge, 
Mass.: N.B.E.R., March 1983. (b) 

Heckman, James J. "Comment." In Behavioral Simulation Methods in Tax Policy 
Analysis, edited by Martin Feldstein. Chicago: Univ. Chicago Press (for 
N.B.E.R.), 1983. 

Perloff, Jeffrey M. "Discussion of the Hausman Paper." In The Supply-Side 
Effects of Economic Policy, edited by Laurence H. Meyer. Proceedings of the 
1980 Economic Policy Conference. St. Louis: Center Study American Bus., 
Washington Univ., 1981. 



Book Review 


The International Transmission of Inflation. By Michael R. Darby, James R. 
Lothian, Arthur E. Gandolfi, Anna J. Schwartz, and Alan G. Stockman. 

Chicago: University of Chicago Press, for National Bureau of Economic Re¬ 
search, 1983. Pp. 727. $69.00. 


This National Bureau of Economic Research monograph is the result of a 
research project that started in 1976 and had as its objective “to accomplish 
two goals: the construction of a consistent and temporally comprehensive 
data base for the eight major industrial countries we wished to study, and 
estimation of a theoretically sound simultaneous model that we could use to 
test and otherwise evaluate competing hypotheses about the genesis and 
spread of inflation internationally" (p. xiv). Both goals were attained, and the 
book by Darby et al. documents the findings in a comprehensive, provocative, 
and readable way. As its length (over 700 pp.) suggests, the monograph 
contains an amount of empirical results that cannot possibly be reviewed 
comprehensively in a few pages. Anyone interested in the generation and 
transmission of inflation and in the issue of the effectiveness of monetary 
policy under pegged exchange rates will find plenty of results to ponder and 
evaluate in this volume. 

The main message that the authors want to get across is that the inflation 
that swept the industrial countries in the late sixties and early seventies was 
fundamentally due to an excessively expansionary monetary policy in the 
United States. Other countries could not avoid importing this inflation in the 
long run as a result of pegging their exchange rates to the dollar even though 
they enjoyed substantial monetary independence in the short run. The prominent 
role of U.S. policy must be regarded as relatively uncontroversial by 
those who take a monetary view of the inflationary process and recognize the 
special status of the dollar in the Bretton Woods system. The empirical evidence 
presented in this volume strengthens the case for this explanation. 
More controversial is the contention that other industrialized countries 
(Canada, France, Germany, Italy, Japan, the Netherlands, and the United 
Kingdom were investigated) had "substantial" monetary independence in the 
short run even though they pegged their currencies to the dollar. Although 
never defined explicitly, the short-run concept used in this context seems to 
refer to a period of at least 1 year. According to the authors of several 
chapters (see below for specifics), the monetary independence was the result 
of weak transmission mechanisms through both goods and asset flows. I will 
argue below that the empirical evidence in favor of this interpretation is not 
clear-cut and that results to the contrary can be found in the volume itself. 
The monograph is organized into five parts: an introduction and background, 
a description and analysis of the eight-country "N.B.E.R. International 
Transmission Model," an analysis of monetary independence and the 
degree of capital mobility, reduced-form tests of price equations and purchas¬ 
ing power parity relationships, and a conclusion. An extensive (193 pp.) and 
well-documented data appendix ends the volume. In attempting to give a 
flavor of the extensive empirical work and to evaluate the conclusions that are 
drawn from it, I shall proceed section by section through the book. 

Part 1, entitled "Preliminaries," begins with a useful "Introduction and 
Summary" by Darby and Lothian, which, as the title suggests, provides a 
guide to the rest of the monograph and states the main conclusions from the 
empirical research. Next Schwartz presents an overview of the evolution of 
the international monetary system during the 1956-76 period (the period 
spanned by the project). Unfortunately, this overview is, in my view, too 
condensed for the uninitiated to get a grasp of the main issues involved and 
too superficial for the specialist. It would have been more useful to have a 
chapter more directly delineating the main questions and hypotheses to be 
addressed and tested in the study and to relate these to the evolution of the 
monetary system as well as of the received theory. The third chapter (by 
Lothian) contains a brief description of the data base and the main reasons 
why it was constructed within the project rather than collected from international 
sources such as the International Monetary Fund's (IMF) International 
Financial Statistics. In the final chapter of part 1 Anthony Cassese and Lothian 
present a number of Granger causality tests with the aim of providing initial 
tests of transmission hypotheses that can be used to guide the structural 
modeling effort. The outcome of these tests is a view that the transmission of 
inflation is due mainly to the domestic monetary consequences of balance of 
payments deficits or surpluses and not to direct price arbitrage. Foreign (read 
U.S.) monetary shocks result in balance of payments disequilibria as a result 
of significant, although not perfect, international capital mobility. 

Part 2 contains a description of the "Mark III International Transmission 
Model," the construction and analysis of which constituted one of the initial 
goals of the project. Although it is not possible to discuss the specification, 
estimation, and analysis of this model in detail, a number of points need to be 
emphasized in order to provide a flavor of the nature of the undertaking this 
model represents. 

Beginning with the specification, it should be noted that no effort was made 
to differentiate between the eight submodels representing each of the economies 
studied in the project by incorporating special features applicable to 
each individual country. On the contrary, a conscious decision was taken not 
to do so in order to render comparisons between countries easier. Of course, 
this gain in comparability is obtained at the risk of specification errors due in 
particular to differences in institutional practice between countries. The submodels 
themselves contain 8-10 endogenous variables and can be thought of 
as variants of an aggregate demand-aggregate supply model incorporating 
the natural rate hypothesis and a number of international transmission mech¬ 
anisms. Specifically, real income is determined by unexpected movements in 
money, government expenditures, and exports; the price level is solved from 
a money demand function that contains money supply shocks as one argument, 
and the money supply is determined by what is called a reaction function 
of the central bank that depends on domestic government expenditures, 
inflation, unemployment, and the balance of payments; the nominal rate of 
interest depends on the expected rate of inflation and on the unexpected 
values of the aggregate demand variables in the model (money, government 
expenditures, and exports). The international sector is made up of an import 
demand equation, an import supply equation (used to determine the price of 
imports), an export equation, and a capital flow equation. The flexible ex¬ 
change rate version of the model adds an intervention function and renor¬ 
malizes the import demand function and the import supply function so that 
they determine, respectively, the relative price of imports and the nominal 
exchange rate. 

Expectations enter the models in two ways. Expected values of money 
supplies, government expenditures, and exports are needed to define the 
innovations that enter the real income equations, among others. Univariate 
ARIMA processes are used for these variables. Expected inflation, which 
helps determine nominal interest rates, and expected exchange rate changes, 
which determine the expected yield on foreign assets in the capital flow equations, 
are generated by transfer functions with the lagged price level, money 
stock, and the yields on domestic and foreign assets as input series for the 
inflation rate and with the balance of payments (under pegged rates) and 
lagged values of the variables in the exchange rate equation (under floating 
rates) as input series for the exchange rate. 
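
The notion of an "innovation" used here can be illustrated with a deliberately simplified sketch. It assumes a first-order autoregression fitted by least squares in place of the ARIMA processes the authors actually use, and it runs on simulated data; the function name and all parameter values are illustrative only.

```python
import numpy as np


def ar1_innovations(series):
    """Fit y_t = c + phi * y_{t-1} + e_t by OLS and return (coef, residuals).

    The fitted values play the role of the expected series and the
    residuals play the role of the innovations (unexpected movements).
    """
    y, y_lag = series[1:], series[:-1]
    X = np.column_stack([np.ones(len(y_lag)), y_lag])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    expected = X @ coef
    return coef, y - expected


# Simulated example: an AR(1) process with a known coefficient.
rng = np.random.default_rng(1)
true_phi, n_obs = 0.7, 2000
series = np.zeros(n_obs)
for t in range(1, n_obs):
    series[t] = true_phi * series[t - 1] + rng.normal()

coef, innovations = ar1_innovations(series)
print("estimated phi:", round(float(coef[1]), 3))
print("mean innovation:", float(innovations.mean()))
```

In a model of the kind described, it is series such as `innovations`, rather than the raw money, government expenditure, or export series, that would enter the real income equations.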

Although it is possible to argue over the appropriateness of some of these 
equations (notably, the price level, which is solved from the money demand 
function rather than from the entire aggregate demand-aggregate supply 
equilibrium; the interest rate equation, which is not obtained from the implicit 
asset demand equilibrium underlying the capital flow equation; and the 
policy reaction functions, which are supposed to determine money stocks 
rather than some interest rate, liquidity, or other target pursued by the cen¬ 
tral bank), they represent fairly well what might be considered a consensus 
model of the open economy, and they do contain transmission mechanisms 
via import and export functions for goods and net capital flows of assets. In 
addition, foreign factors are allowed to affect real income directly through an 
aggregate demand effect, and the relative price of oil is included as a determinant 
of import prices and, in an extended real income equation, as a direct 
influence on domestic output. 

The model was estimated with quarterly data for the eight countries mentioned 
above. The sample period was 1957:I-1976:IV for all equations except 
those (import demand, import supply, and intervention equations) that 
were different (renormalized) for the fixed and floating rate periods. The 
estimation method was two-stage least squares in which principal components 
were used in the first stage in order to conserve degrees of freedom. 
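
A minimal sketch of two-stage least squares with principal components in the first stage is given below. It is not the authors' estimation code: it assumes a single endogenous regressor, simulated data, and an arbitrary choice of three components, and all names are illustrative. The point is only to show how replacing a large instrument set by a few components conserves degrees of freedom.

```python
import numpy as np


def pc_two_stage_least_squares(y, x, Z, n_components):
    """2SLS of y on one endogenous regressor x, using the leading principal
    components of the instrument matrix Z as first-stage regressors.

    Returns the second-stage coefficients (intercept, slope).
    """
    n = len(y)
    Zc = Z - Z.mean(axis=0)
    # Principal components via the SVD of the centered instruments.
    U, s, _ = np.linalg.svd(Zc, full_matrices=False)
    pcs = U[:, :n_components] * s[:n_components]
    W = np.column_stack([np.ones(n), pcs])
    # First stage: fitted values of x from a constant and the components.
    x_hat = W @ np.linalg.lstsq(W, x, rcond=None)[0]
    # Second stage: regress y on a constant and the fitted x.
    X_hat = np.column_stack([np.ones(n), x_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


# Simulated data: 20 instruments driven by 3 common factors (an assumption).
rng = np.random.default_rng(0)
n = 4000
factors = rng.normal(size=(n, 3))
loadings = rng.normal(size=(3, 20))
Z = factors @ loadings + 0.5 * rng.normal(size=(n, 20))
u = rng.normal(size=n)                        # structural error
x = factors.sum(axis=1) + 0.8 * u + 0.6 * rng.normal(size=n)  # endogenous
y = 1.0 + 2.0 * x + u                         # true slope is 2.0

intercept, slope = pc_two_stage_least_squares(y, x, Z, n_components=3)
ols_slope = np.polyfit(x, y, 1)[0]
print("OLS slope:", round(float(ols_slope), 3))
print("PC-2SLS slope:", round(float(slope), 3))
```

In this artificial setting ordinary least squares is biased upward because x is correlated with the error, while the component-based estimator recovers a slope close to the true value using three first-stage regressors instead of twenty.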

The authors of the chapters on the specification and estimation of the 
model (Darby and Stockman) summarize the main empirical results “by the 
statement that linkages among countries joined by pegged exchange rates 
appear to be much looser or more elusive than has been assumed in many 
previous studies” (p. 120). The reasons for this can be attributed to three 
properties of the estimated model: "First, relative price effects on the balance 
of trade are not large, although they increase over time. Second, the effect of 
the balance of payments on the domestic money supply and hence on domes¬ 
tic prices is small and operates with a lag. This reflects the apparent practice 
of sterilization of contemporaneous reserve flows" (pp. 120-21). Third, "international 
capital flows do not appear to be very well related to interest 
differentials” (p. 121). Among the many other conclusions the authors men¬ 
tion, two will turn out to be of special importance in my evaluation of the 
claim that the Bretton Woods system afforded substantial short-run monetary 



lOVRNAL OF POLITICAL ECONOMY 

i o-jfl 

mdeuemlence tor non-reserve-currency countries despite the pegged ex- 
range rites. These are that the effects of money shocks on rea/income are 
vet v weak (in several cases nil) in countries other than the United States and 
that money seems to play a shock absorber role in the sense that money 
supply innovations have a positive influence on money demand. 

In two additional chapters, Darby conducts a number of simulation experi¬ 
ments designed to provide evidence about the domestic and external effects 
of policies in the United States and in the other countries and about the 
effects of changes in the price of oil. In the first, money supply and govern¬ 
ment expenditure shocks for the United States, Germany, and the United 
Kingdom were considered. The French, Italian, and Japanese submodels 
were not included in these simulations as apparently no useful results were 
obtained for these countries (the results were "erratic" and “inexplicable," 
presumably because of "peculiar estimated coefficients" [p. 170]). No useful 
results for any country could be obtained from the floating rate version of the 
model as a result of dynamic instability. As to the results, "the simulations 
confirm the apparent implications of the Mark III estimates: International 
transmission of inflation through money flows is a weak and slow process even 
under pegged exchange rates, with non-reserve countries exercising consid¬ 
erable short-run control of their money supplies" (p. 201). 

In a second chapter, Darby simulates the effects on the model of an oil 
price shock. He documents substantial negative real income effects of an 
increase in the relative price of oil, but since the simulated model is dynamically 
unstable, I would not (and neither does Darby himself) want to place too 
much credence on these results. 


In the final chapter of this part of the book, Darby investigates the properties 
of the real income equation incorporated in the model. The main purpose 
of the investigation is to test whether it is unanticipated or actual values 
of the aggregate demand variables that belong in that equation. Three findings 
emerge that are important for the overall conclusions of the volume: (i) it 
is only the U.S. equation that fits reasonably well; (ii) money shocks are not 
important except in the United States and to a lesser extent in Germany, Italy, 
and the Netherlands; and (iii) it makes very little difference (again except for 
the United States) if one uses anticipated instead of unanticipated money in 
these equations. 

Parts 3 and 4 contain a number of empirical studies that are based not on 
the simultaneous model developed in part 2 but on smaller simultaneous 
models and on reduced forms. Part 3 deals with the question of monetary 
control under the Bretton Woods system (and of the effectiveness of sterilized 
interventions under floating). It includes four chapters: one by Darby on 
sterilization of reserve flows under the Bretton Woods regime using a reduced-form 
test; one by Daniel Laskar estimating "offset coefficients" in capital 
flow equations and sterilization coefficients in reaction functions using 
both reduced forms and a small simultaneous equation model; one by 
Michael Melvin on the proper specification and estimation of capital flow 
equations; and one (theoretical) by Dan Lee on the relative effectiveness of 
open market operations (OMOs) and foreign exchange market operations 
(FEMOs) under floating rates. Results of the first three will be discussed when 
I evaluate the conclusion of the volume with respect to the question of mone¬ 
tary control of non-reserve-currency countries. Lee’s paper (apparently writ¬ 
ten in 1980) develops what must now be considered “standard" results from 
portfolio balance models with imperfect substitutability between domestic 
and foreign assets: OMOs have a relatively stronger effect on the domestic 
interest rate than FEMOs, and the reverse is true as far as the effect on the 
exchange rate is concerned. Which of the two has a stronger effect on output 
depends on the relative strengths of interest rate and exchange rate effects on 
aggregate demand. It should be noted that expectations (of inflation and 
exchange rate changes) are modeled as adaptive processes in the paper, a fact 
that may influence the generality of these results. 

Part 4 deals with various aspects of the inflation process in the eight coun¬ 
tries. In one chapter Gandolfi and Lothian estimate price equations derived 
from a demand for money function and a Lucas supply function. They con¬ 
clude “that movements in domestic money in all eight countries serve as the 
key link in the process leading to changes in the domestic rates of inflation." 
They also conclude that foreign inflation (measured by U.S. inflation or a 
rest-of-the-world inflation rate) “had a direct impact on domestic rates of 
inflation in relatively few of the comparisons we made” (p. 438). While the 
former of these conclusions can be found also in other studies of the transmission 
of inflation, the latter implies a smaller direct effect of foreign prices 
than has been found in, for instance, the individual country studies by Kor- 
teweg for the Netherlands, Fourcans for France, and Fratianni for Italy in 
Brunner and Meltzer (1978) and in the multicountry study by Cross and 
Laidler (1976). Possible explanations for this difference will be offered below. 

In another chapter Darby argues that U.S. monetary growth was exoge¬ 
nous relative to rest-of-the-world factors as a result of the operation of the 
dollar standard during much of the Bretton Woods period. He presents 
evidence consistent with this view. This evidence is based on money supply 
equations for U.S. dollars in which three specific measures of the U.S. balance 
of payments fail to enter as significant explanatory variables. Darby also pre¬ 
sents some evidence suggesting that U.S. long-term inflation is to a dominant 
extent explained by the growth rate of the U.S. money supply. He therefore 
concludes that U.S. inflation (and inflation in countries that kept their ex¬ 
change rates fixed with respect to the dollar) was the result of the excessively 
expansionary monetary policy in the United States. 

The findings concerning the special role played by the United States under 
the Bretton Woods system are in conformity with the theoretical results derived 
by Swoboda (1978). The empirical evidence on the same issue presented 
in Genberg and Swoboda (1982) confirms the dominance of U.S. monetary 
policy for the evolution of the world money supply and of the aggregate 
money supply in the countries other than the United States. Contrary to 
Darby, however, Genberg and Swoboda reject the hypothesis of complete 
dominance by the United States. A possible reason for the difference between 
these findings is that the studies do not use the same measures of the external 
influence on the U.S. money supply. 

In a third chapter in this part of the volume, Darby documents the recently 
noted tendency for deviations from purchasing power parity to follow a pro¬ 
cess (a moving-average adjustment process appears to be present) close to a 
random walk. This implies that there is no value toward which these devia¬ 
tions will tend in the long run. Darby’s model and results also imply that “the 
further ahead we make predictions of [these deviations], the greater the 
[forecast error] variance. On the other hand, the longer period over which we 
predict the average growth rate of [the deviations], the smaller is the vari¬ 
ance” (p. 470). 
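
The two variance statements in the quotation can be illustrated under the simplifying assumption that the deviations follow a pure random walk with independent shocks of variance sigma squared (Darby's process also has a moving-average component, which is ignored here). The h-step forecast error of the level then has variance h times sigma squared, while the average per-period growth over h periods has variance sigma squared divided by h. The names and parameter values below are illustrative.

```python
import numpy as np


def random_walk_variances(sigma2, horizon):
    """Theoretical variances for a pure random walk with shock variance sigma2."""
    level_var = horizon * sigma2        # h-step-ahead forecast error of the level
    avg_growth_var = sigma2 / horizon   # mean per-period change over h periods
    return level_var, avg_growth_var


# Monte Carlo check with simulated shocks (all values are assumptions).
rng = np.random.default_rng(0)
sigma, horizon, n_paths = 0.02, 16, 20000
shocks = rng.normal(0.0, sigma, size=(n_paths, horizon))
cumulative = shocks.sum(axis=1)         # change in the level after h periods

sim_level_var = cumulative.var()
sim_avg_growth_var = (cumulative / horizon).var()
theory_level_var, theory_avg_growth_var = random_walk_variances(sigma**2, horizon)

print("level variance: simulated", sim_level_var, "theory", theory_level_var)
print("average growth variance: simulated", sim_avg_growth_var,
      "theory", theory_avg_growth_var)
```

Lengthening the horizon therefore raises the uncertainty about the level of the deviation and lowers the uncertainty about its average growth rate, which is the pattern the quoted passage describes.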

In part 5 Darby and Lothian draw the major conclusions of the volume. 
They group these under four headings: "Money versus Special Factors” as 
causes of inflation, the “Channels of Transmission" of inflation during the 
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Bretton Woods era. “Sterilization and Monetary Control" by 
central banks, and “Monetary Policy and the International Monetary
[...] both in the United States [...] but that impact was neither continual nor [...]

Although I have no particular disagreement with this interpretation of the
results, it must be noted, as the authors do, that the sample period covered in
the studies contains only the first of the two oil shocks that occurred in the
seventies. This, and the fact that the relative price change that took place then
was so large, may make it very difficult to isolate its impact by traditional
econometric methods.

The second set of conclusions concerns the transmission mechanisms that
link national inflation rates in a fixed exchange rate environment. Here it is
argued that the forces that render countries interdependent are relatively
weak and considerably slower than commonly thought. It is claimed that
transmission due to substitutability between goods produced in different
countries is weak [...] relative price effects in the export and import
[...] and because direct foreign price [...] are of little significance. As already
noted, this evidence is contrary to other recent evidence obtained for a partly
overlapping set of countries, namely the Netherlands, France, and Italy, and
for other small open European economies. The differences may stem from a
number of sources. One candidate is the specification of the price equation in
the Mark III model. No measure of foreign prices enters this equation directly,
contrary to the case in the other studies. A second possible reason may
be that foreign inflation is not allowed to influence domestic inflationary
expectations in the manner suggested and documented empirically in Cross
and Laidler (1976). A third possibility might be the choice of foreign price 
variable for the reduced-form price equations. Import prices, for instance, 
might have been more appropriate for this purpose than the U.S. general 
price level. 

Although I remain unconvinced that price effects are as unimportant as 
Darby and Lothian claim, I do go along with their judgment that the strict 
“law-of-one-price” paradigm must be used with care in short-run international
modeling as far as goods market linkages are concerned. Where I am
more in disagreement with Darby and Lothian is in their evaluation of the
evidence with respect to international asset substitution. To a large extent
their conclusion is based on the small interest rate effects on capital flows that
are obtained in the Mark III model. This is in direct contradiction to the
relatively large coefficients that Laskar obtains in a very careful and interesting
chapter in part 3 of the book. Laskar estimates so-called offset coefficients
in reduced-form capital flow equations that vary from a low between
-0.40 and -0.50 for Japan (the validity of an even lower value for the
United Kingdom is questioned by the author) to a high around -0.90 for
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results of capital controls in the early part of the sample
[...] dismantling of these controls renders this explanation highly implausible
for the flexible rate period. For the same reason it is unlikely that the assumption
of weakly integrated asset markets is appropriate if one is interested in
predicting the effects on monetary independence of a return to pegged exchange
rates under present circumstances.

A frequently stated conclusion, not only in the final chapter but in several
others, is that the Bretton Woods system of pegged exchange rates allowed a
substantial degree of short-run monetary independence for the participating
countries. Weak transmission effects through goods and asset flows are
pointed to as explanations for this independence. As we have seen, however,
there are good reasons to question the empirical basis for the assumption of
weak transmission mechanisms. Possibly in anticipation of doubts of the nature
raised above, the case for monetary independence is also built on estimates
of money supply reaction functions for the non-reserve-currency countries
in the sample. An interesting test is proposed by Darby. He shows that if
the stock of money in a country is completely demand determined, then
variables other than the balance of payments, normally thought of as determining
the behavior of the central bank, should not be significant in an
equation explaining the actual stock of money in existence. Empirical evidence
suggests that such variables do indeed have some explanatory power,
leading Darby to the conclusion that monetary control was exercised by the
countries studied. Interesting as it is, this test seems to break down if, as
concluded from the estimation of the Mark III model, the Darby-Carr view of
the shock absorber role of money is accepted. According to that view, innovations
in the money supply should appear in the money demand function, rendering
the proposed test for monetary independence unable to do what it is
asked to do.


My evaluation of the empirical evidence presented here as well as elsewhere 
in the literature is that the degree of financial integration among the major 
industrialized countries is substantial. I base this judgment on, among other
factors, the size of offset coefficients in capital flow equations and the strength
of correlations between interest rate movements in different countries. Together
with evidence showing that domestic discretionary monetary policy
has quite small effects on domestic economic activity in countries other than
the United States, this leads me to conclude that the degree of monetary
independence under the Bretton Woods period was much smaller than argued
in the conclusion to this volume. I therefore agree with Darby and
Lothian that the breakdown of that system was due to incompatible goals of
monetary policy in the participating countries. Where I part company with
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them is in their contention that cumulative price level discrepancies brought 
about by these different goals were the proximate causes for the breakdown. I 
maintain that the existing evidence is more consistent with the view that high 
and increasing capital mobility combined with substantial sterilization of reserve
flows on the part of central banks was the real reason.

Despite my disagreement with some of the conclusions drawn from the
work reported in this book, I can recommend it to anyone who is interested in
empirical studies on international macroeconomic questions. Careful empirical
work is always welcome, and in this respect this volume has much to offer.

A few minor complaints and warnings should be registered in conclusion.
In several places, but most notably in a chapter by Darby on sterilization and
monetary control, there is an unhelpful confusion between the monetary
approach to balance of payments and exchange rate analysis as an “approach”
and specialized versions of that approach that have been used in the literature.
More important, the references in the book are at times somewhat
dated; very few articles published after 1980 are included. This can be explained
by the long period between the completion of the manuscript (1981)
and the publication date (1983), but it is on occasion mildly annoying. It
should also be noted that at least eight of the 17 chapters in the book have
been published in what appear to be similar forms elsewhere. This diminishes
to some extent the originality of the material in the volume as such, but I still
consider it valuable to have the empirical evidence it contains collected in one
place.

One cannot end a review of this book without congratulating the authors
(Lothian with the assistance of Anthony Cassese and Laura Nowak) for the
extremely comprehensive and detailed data appendix. Not only do they present
the actual series used for the eight countries, but they also document
them very well. Unfortunately, it does not appear to be a simple matter to
update the data set beyond 1976 by using readily available sources. It is
mentioned that Lothian is currently attempting to alleviate this shortcoming.
For all those who are engaged in empirical work on international macroeconomics,
I hope that these additional data will become available in published
form. Combined with those published in this volume, they would constitute
an extremely valuable source for empirical researchers.

Hans Genberg

Graduate Institute of International Studies
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Although recent research suggests that intergenerational transfers 
play an important role in aggregate capital accumulation, our under¬ 
standing of bequest motives remains incomplete. We develop a sim¬ 
ple model of strategic bequests in which a testator influences the 
decisions of his beneficiaries by holding wealth in bequeathable 
forms and by conditioning the division of bequests on the benefi¬ 
ciaries’ actions. The model generates falsifiable empirical predictions 
that are inconsistent with other theories of intergenerational transfers. 
We present econometric and other evidence that strongly suggests 
that bequests are often used as compensation for services rendered 
by beneficiaries. 


Tell me, my daughters 

(Since now we will divest us both of rule, 

Interest of territory, cares of state), 

Which of you shall we say doth love us most, 

That we our largest bounty may extend 
Where nature doth with merit challenge. 

[Shakespeare, King Lear] 

We would like to thank Robert Barro, Gary Becker, John Bound, Victor Fuchs,
Mervyn A. King, James M. Poterba, Robert Vishny, Robert Waldmann, as well as
seminar participants at Brown, Chicago, Harvard, MIT, North Carolina State, and
Stanford for helpful comments. Financial support from the NSF is gratefully acknowledged.
The views expressed here are ours and should not be attributed to any organization.
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Much recent research suggests that intergenerational transfers play 
an important role in aggregate capital accumulation. Kotlikoff and 
Summers (1981) estimate that about four-fifths of U.S. wealth accumulation
is due to intergenerational transfers.¹ Several other studies,
including Brittain (1978), Mirer (1979), and Bernheim (1984b),
have found that the savings behavior of retirees is inconsistent with
strong forms of the life-cycle hypothesis.² While intergenerational
transfers appear to be of central importance in understanding pat¬ 
terns of capital accumulation and familial behavior, relatively little is 
known about what motivates individuals to leave bequests. 

In this paper we develop a model of “strategic” bequests and present
some preliminary empirical tests of it. The central premise
underlying our formulation is that testators use bequests to influence
the behavior of potential beneficiaries. Such influence may be overt,
as when parents threaten to disinherit miscreant offspring, or more
subtle, as when parents reward more attentive children with family
heirlooms. As we discuss below, models of strategic bequests have
very different implications for the effects of institutions such as social
security and private pensions on the rate of capital formation, and 
individual behavior more generally, than do alternative models. In 
our model, the Ricardian equivalence theorem of Barro (1974) does 
not hold. 

In our theoretical formulation, we envision a testator who, though 
altruistic, is also affected by actions taken individually by a number of 
potential beneficiaries (he may, e.g., enjoy receiving attention from 
his children). We argue that, in such circumstances, the testator will 
necessarily wish to influence his beneficiaries’ decisions by condition¬ 
ing the division of bequests (perhaps through informal means) on 
actions they take. However, he is constrained in this regard by consid¬ 
erations of credibility: he cannot, for example, credibly threaten uni¬ 
versal disinheritance. We show that as long as there are at least two 
credible beneficiaries, it is possible for the testator to devise a simple, 
intuitively appealing bequest rule that overcomes the problem of 
credibility and allows him to appropriate all surplus generated from


1 The significance of intergenerational transfers is still the subject of much debate.
Tobin (1967) and Davies (1981) present simulation results that indicate that pure life-cycle
motives are sufficient to account for the bulk of U.S. capital

2 Brittain (1978) and Mirer (1979) document continued accumulation of wealth after
retirement. Shorrocks (1975), Diamond and Hausman (1982), and King and Dicks-Mireaux
(1982) find limited evidence to dispute this claim. Bernheim (1984b) confirms
this finding but demonstrates that behavioral responses of rates of decumulation to
nondiscretionary annuities are inconsistent with the predictions of simple life-cycle
models.
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testator-beneficiary interaction. This surplus provides an incentive 
for the testator to forgo other forms of consumption or, alternatively, 
to hold his wealth in bequeathable rather than annuitized forms. 

No single tractable analytic model can capture as varied a phenom¬ 
enon as intergenerational transfers. We believe, however, that the 
model of strategic bequests presented here is a valuable supplement 
to conventional formulations that rely on ad hoc bequest motives or 
intergenerational altruism. In particular, our model helps to explain 
several empirical observations that seem inconsistent with other for¬ 
mulations. Furthermore, it generates falsifiable empirical predictions, 
thereby lending itself to econometric testing. 

The notion that anticipated bequests may influence the behavior of 
potential beneficiaries has previously received varying amounts of 
attention from Sussman, Cates, and Smith (1970), Barro (1974, 
p. 1106, n. 14), Becker (1974, 1981), Ben-Porath (1978), Adams
(1980), Kotlikoff and Spivak (1981), and Tomes (1981). However, 
with the exception of Becker’s work, these studies lack a complete 
model of the exchange process. Typically, it is implicitly assumed that 
unwritten agreements between family members are perfectly enforce¬ 
able. By explicitly modeling the strategic choices of parties to such 
agreements, we generate sharp, empirically testable predictions con¬ 
cerning the circumstances under which these agreements are enforce¬ 
able. 

Becker considers a world in which enforceability is not an issue. His
“rotten kid theorem” establishes that under certain circumstances 
automatic changes in an altruistic parent’s transfers to selfish children 
provide these children with optimal incentives. Thus the parent has 
no strategic motive—he does not wish to precommit to some alterna¬ 
tive compensation scheme. A further consequence of this environ¬ 
ment is that forced transfers between children or between children 
and parents have no ultimate effects and that the Ricardian equiva¬ 
lence theorem holds. However, the rotten kid theorem depends criti¬ 
cally on the assumption that monetary income alone determines each 
agent’s well-being. In our more general framework, automatic trans¬ 
fers are insufficient and strategic behavior comes into play. 

The paper is organized as follows. Section I presents our model of 
strategic bequests and characterizes its solution. In Section II we pre¬ 
sent econometric evidence on bequeathable assets and beneficiary be¬ 
havior that supports the model. Section III discusses the ability of 
various bequest theories to account for certain stylized facts. Finally, 
in Section IV we examine some implications of our model for issues 
such as the effect of social security, government debt, and private 
pensions on capital formation and family behavior. 
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The notion that rules governing the division of bequests may in¬ 
fluence actions taken by potential beneficiaries is not a new one. In¬ 
deed, Becker (1974, 1981) has argued that variations in transfers to 
selfish children force these children to consider their parents’ inter¬ 
ests. This is the basis for his rotten kid theorem, which establishes that 
each beneficiary, no matter how selfish, maximizes the total family 
income available to the altruistic benefactor. In effect, the benefac¬ 
tor’s welfare is maximized even though he does not design a rule for 
dividing his estate with the intent of providing incentives for beneficiaries. 

In this paper, we advance and test a theory of bequests in which 
the testator intentionally manipulates the behavior of his beneficiaries 
through his choice of a rule for dividing his estate. The fundamental 
difference between Becker’s model and ours is that in our model 
influence is strategic, while in Becker’s it is nonstrategic. As we shall 
see, this distinction may be of profound importance in a variety of 
policy-related contexts. 

It may appear that there is no need for a theory of strategic in¬ 
fluence: if the rotten kid theorem holds, an altruistic testator cannot 
improve on the incentives created by automatic (nonstrategic) variations
in his bequests. Thus Section IA is devoted to an analysis of the
following question: When would a testator wish to improve on these 
automatic incentives (i.e., behave strategically)? We observe that the 
validity of Becker’s theorem is limited to cases in which the welfare of 
testators and beneficiaries alike depends only on monetary income. In 
particular, if an altruistic testator also cares directly about some action 
taken by a potential beneficiary (as seems inevitable), nonstrategic 
incentives will not produce an efficient outcome, let alone maximize 
the testator’s well-being. 

Having established a motive for the exercise of strategic influence 
on the part of testators, we turn to a second question: When can
testators successfully manipulate behavior, and how do they do so?
We argue in Section IB that success in this regard requires the testator
to have at least two individuals (or institutions) whom he could cred¬ 
ibly name as beneficiaries. When this condition is satisfied, testators 
can design a rule for dividing bequests that extracts the full surplus 
generated from interactions with beneficiaries. Otherwise the testator 
cannot successfully precommit himself to any scheme that improves 
on automatic incentives. 

A. Altruism and Strategic Interaction 

We begin with a brief exposition of the rotten kid theorem. Suppose 
there are two agents, a parent ($p$) and a child ($k$). The child is selfish in



STRATEGIC BEQUEST MOTIVE 




the sense that his utility is a function only of his own consumption, $c_k$.
The parent, on the other hand, is altruistic in the sense that he cares
both about his own consumption, $c_p$, and about his child’s utility. We
write these utility functions as $U_k(c_k)$ and $U_p[c_p, U_k(c_k)]$, respectively.

Suppose that agent $i$ ($i = k, p$) has income $y_i$. Suppose also that the
child takes some action that affects both $y_k$ and $y_p$; subsequent to this
choice, $p$ makes a utility-maximizing transfer to $k$. The rotten kid
theorem states that $k$ will choose an action that maximizes total family
income, $y \equiv y_k + y_p$ (and therefore that maximizes the parent’s welfare).

We illustrate this principle in figure 1. Let $y^1$ and $y^2$ be two distinct
levels of family income. Each defines an opportunity set for the parent
in terms of achievable levels of $c_k$ and $c_p$. As long as the parent is
not at a corner (his transfer to the child is positive), the allocation of
consumption between $p$ and $k$ is determined as the tangency between
these budget constraints and the parent's indifference curves. Thus,
assuming $c_k$ is a normal good for the parent, the child maximizes his
own welfare by choosing an action that maximizes $y$. In this context,
the parent has no need to manipulate his child’s decision since auto¬ 
matic adjustments in transfers create optimal incentives. It should 
also be apparent from figure 1 that a forced transfer from p to k will 
have no effect on either’s level of consumption as long as the par¬ 
ent is not at a corner. This is the basis of the Ricardian equivalence 
theorem. 
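
The logic of figure 1 can be checked numerically. The following is a minimal sketch that assumes, purely for illustration, logarithmic utilities $U_k = \log c_k$ and $U_p = \log c_p + \beta U_k$; the weight `BETA`, the income pairs, and all function names are hypothetical and are not part of the original analysis. Under these assumptions an interior transfer gives the child the share $\beta/(1+\beta)$ of family income, so the child does best by choosing the action that maximizes $y$, and a small forced transfer from $p$ to $k$ leaves both consumption levels unchanged.

```python
import math

BETA = 0.5  # hypothetical weight the parent places on the child's utility


def consumption(y_p, y_k, beta=BETA):
    """Return (c_p, c_k) after the parent's utility-maximizing transfer.

    Assumes U_p = log(c_p) + beta * log(c_k), so the interior optimum
    gives the child the share beta / (1 + beta) of family income.
    """
    y = y_p + y_k
    transfer = max(beta * y / (1.0 + beta) - y_k, 0.0)  # no negative transfers
    return y_p - transfer, y_k + transfer


def child_best_action(actions):
    """Pick the action whose (y_p, y_k) outcome maximizes the child's consumption."""
    return max(actions, key=lambda name: consumption(*actions[name])[1])


# Hypothetical actions: "selfish" favors the child's own income,
# "cooperative" yields the larger total family income.
actions = {"selfish": (70.0, 30.0), "cooperative": (90.0, 25.0)}
best = child_best_action(actions)

# A forced transfer of 2 from parent to child, with the parent not at a corner.
before = consumption(70.0, 30.0)
after = consumption(68.0, 32.0)
```

In this sketch the child prefers the cooperative action even though it lowers his own pre-transfer income, and `before` and `after` coincide, which is the neutrality property behind the Ricardian equivalence theorem. Both results rely on the parent's transfer being strictly positive.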

However, even if testators care about the well-being of their 
beneficiaries, they may not be perfect altruists. Examples abound: An 
individual might desire more attention from his own children, object 
to a relative’s choice of spouse, or want to be cared for by a sibling or 
grandchild. Institutions (such as universities) commonly treat wealthy 
patrons particularly well, perhaps to encourage further support in 
the form of gifts and bequests. In such cases, the rotten kid theorem 
does not hold. 

For concreteness we consider our previous example, modified as 
follows. First, $y_p$ and $y_k$ are fixed. Second, the child takes some action,
$a$, which we will think of as attention given to the parent (visits, care,
etc.). Both the parent and the child care about $a$ directly. Thus,
utilities are given by $U_k(c_k, a)$ and $U_p[c_p, a, U_k(c_k, a)]$.

We assume that the child’s utility first increases and then decreases
in $a$. The parent’s utility, with $U_k$ kept constant, always increases in $a$
initially, though it may decline in $a$ for high enough attention levels.
In fact, we assume something stronger, namely, that parents tire of
attention only after children do, if at all (i.e., $\partial U_p/\partial a > 0$ when $\partial U_k/\partial a \ge 0$).
Finally, in this exposition, we also suppose that the parent’s
overall utility (i.e., allowing for the effect of $a$ on $U_k$) declines in $a$ for
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Fig. 1. An illustration of the rotten kid theorem


high enough a, either because he gets tired of attention or because 
his concern for his child’s disutility dominates his direct desire for
attention. 

As before, the parent selects a transfer to the child subsequent to 
the child’s choice of a. What is the resulting allocation of consumption 
and attention? 

We illustrate the solution to this problem in figure 2. By substituting
for $c_p$ ($= y - c_k$) in $p$’s utility function, we can represent $p$’s
preferences entirely in the $(a, c_k)$ plane. Point $D$ represents $p$'s global
maximum; $I_p$, $I_p'$, and $I_p''$ are indifference curves for successively lower
levels of utility.

For any level of $a$, how will $p$ divide resources between himself and
$k$? The answer to this question is determined by drawing a vertical line
at the relevant value of $a$ and locating a tangency with one of $p$'s
indifference curves. This procedure generates an optimal response
function for $p$, $c_k^*(a)$. One cannot say, a priori, whether this curve
slopes upward or downward. If the level of $a$ does not affect $p$’s
marginal rate of substitution between $c_k$ and $c_p$, $c_k^*(a)$ will be constant,
at some level $c_k^*$, as shown.³ In this special case, automatic variations in
transfers provide absolutely no incentives for the child!
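
To see how a flat response schedule can arise, consider an additively separable specification. This is an assumption made here for concreteness and is not the example given in the footnote; the functions $g$ and $v$ and the weight $\beta$ are hypothetical.

```latex
\begin{align*}
U_k(c_k, a) &= \log c_k + g(a), \\
U_p &= \log c_p + v(a) + \beta\, U_k(c_k, a), \qquad c_p = y - c_k, \\
\frac{\partial U_p}{\partial c_k}\bigg|_{a}
  &= -\frac{1}{y - c_k} + \frac{\beta}{c_k} = 0
  \quad\Longrightarrow\quad
  c_k^*(a) = \frac{\beta}{1+\beta}\, y .
\end{align*}
```

Because $a$ drops out of the first-order condition, the transfer chosen by $p$ does not respond to the level of attention, and the child's choice of $a$ is governed by $g$ alone.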

Anticipating the parent's response (or lack thereof), the child effectively
chooses a point on $c_k^*(a)$. We therefore superimpose the child’s


3 For example, if $U_p(c_p, a, U_k) = [\ldots]$ and $U_k(c_k, a) = (c_k)^{[\ldots]} g(a)$, then $c_k^*(a)$ will

be flat.




indifference curves on the diagram ($I_k$, $I_k'$) and look for a tangency
with $c_k^*(a)$. This is given by point $A$.

Where does $A$ lie relative to $D$? Consider a small horizontal movement
to the right from $A$ along $c_k^*(a)$. By definition, this does not
affect $k$’s utility; $c_p$ is unaffected, and $a$ increases. Since the parent still
wants some more attention for his own pleasure, this movement
strictly increases his utility. Thus, $D$ must lie to the right of $A$, as
shown. Put another way, if $D$ were to the left of $A$, then a slight
increase in $a$ would be resisted by the parent but would not matter to
the child: a possibility we do not admit.

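This argument can be made more formally. Consider the derivative

$$\frac{dU_p}{da} = \frac{\partial U_p}{\partial a} + \frac{\partial U_p}{\partial U_k}\left(\frac{\partial U_k}{\partial a} + \frac{\partial U_k}{\partial c_k}\frac{\partial c_k}{\partial a}\right) + \frac{\partial U_p}{\partial c_p}\frac{\partial c_p}{\partial a}.$$

$\partial U_k/\partial a = 0$ since the child is at his optimum; $\partial c_k/\partial a = 0$ since we are on
the parent’s optimal response schedule; and $\partial c_p/\partial a = 0$ since that
schedule is horizontal. Hence if $\partial U_p/\partial a > 0$ (and we assume that it is
when $\partial U_k/\partial a = 0$), then $dU_p/da > 0$.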

Note, further, that since $A$ lies on $c_k^*(a)$, $p$'s indifference curve
through this point ($I_p''$) must be vertical. Since $k$’s indifference curve
through A is horizontal, we know that Pareto improvements are pos¬ 
sible. In particular, the shaded lens in figure 2 represents the set of 
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Pareto improvements (B is p’s most preferred point in the set). Note 
that all Pareto-improving points involve a larger transfer and more 
attention than does the autarkic point A. 

We summarize these results as follows: Suppose that an altruistic 
parent cares directly about the level of attention supplied by his child. 
Suppose also that the level of attention chosen does not affect the 
parent's marginal rate of substitution between his child’s consumption 
and his own consumption. Finally, suppose the parent would, as a 
direct effect, prefer more attention whenever the child wants to sup¬ 
ply more attention. Then, regardless of how much the child likes to 
supply attention, the level of attention induced by automatic incen¬ 
tives is less than the parent’s global optimum. Furthermore, Pareto 
improvements are available, all of which entail more attention and a
larger transfer to the child. 

Under these circumstances the parent will not want to respond 
passively to the child’s choice. If possible, he will precommit himself to 
something other than the automatic incentive scheme. Suppose, for 
the moment, that a threat to disinherit the child, leaving him with 
consumption, $\bar{c}_k$, is credible. What should the parent do? Once again
refer to figure 2. If disinherited, the child would pick point E. For his 
threat to be effective, the parent must insist on an action that leaves 
the child at least as well off as he was at E. Point C satisfies this 
condition and yields the parent more utility than any other such 
point. Thus the parent would offer the child point C, with the threat 
of disinheritance if the child refuses. In Section IB we will discuss the
credibility of such a threat. 
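
The construction of points $A$, $D$, $E$, and $C$ can be sketched numerically. The example below is only an illustration under assumed functional forms: $U_k = \log c_k + g(a)$ with $g(a) = a - a^2/2$, and $U_p = \log c_p + \gamma a + \beta U_k$. The incomes, the parameters `BETA` and `GAMMA`, the grid, and all names are hypothetical. These forms make the response schedule flat, so the disinherited child chooses the same attention level at $E$ as at $A$.

```python
import math

Y_P, Y_K = 70.0, 30.0    # hypothetical fixed incomes of parent and child
BETA, GAMMA = 0.5, 0.5   # hypothetical altruism weight and taste for attention
Y = Y_P + Y_K


def g(a):
    """Child's direct utility from attention: rises, then falls (peak at a = 1)."""
    return a - 0.5 * a * a


def u_child(c_k, a):
    return math.log(c_k) + g(a)


def u_parent(c_k, a):
    return math.log(Y - c_k) + GAMMA * a + BETA * u_child(c_k, a)


C_STAR = BETA * Y / (1.0 + BETA)              # flat automatic response c_k*(a)
GRID = [i / 1000.0 for i in range(0, 3001)]   # candidate attention levels

# Point A: the child picks attention along the flat schedule.
a_A = max(GRID, key=g)
# Point E: a disinherited child consumes his own income; with separable
# utility he chooses the same attention level as at A.
u_E = u_child(Y_K, a_A)
# Point D: the parent's global optimum (bliss point).
a_D = max(GRID, key=lambda a: u_parent(C_STAR, a))


def offer(a):
    """Cheapest consumption the parent can offer at attention a.

    The child must be at least as well off as at E; the parent never
    gives less than his own preferred level C_STAR.
    """
    floor = math.exp(u_E - g(a))
    return max(C_STAR, floor)


# Point C: the parent's best offer backed by the threat of disinheritance.
feasible = [a for a in GRID if offer(a) < Y]
a_C = max(feasible, key=lambda a: u_parent(offer(a), a))
point_C = (a_C, offer(a_C))
```

With these illustrative numbers the offer at $C$ involves more attention and a larger transfer than the automatic outcome $A$, the child is held to the utility he would obtain at $E$, and the parent is strictly better off than at $A$. The grid search is a deliberate simplification; a root finder on the first-order condition would serve equally well.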

So far we have considered only the case where $c_k^*(a)$ is flat; as mentioned,
it may slope upward or downward. If it slopes downward, our
summarized results still hold. If it slopes upward, $A$ may lie to the
right of $D$.⁴ However, except by accident, $A$ and $D$ would not coincide.
Thus, there is almost always a potential for Pareto improvements and 
an incentive to engage in strategic manipulation. 

In our environment it is easily possible for the Ricardian equiva¬ 
lence theorem to fail even while parents are making altruistically 
motivated bequests to their children. This will occur as long as children
do not prefer their parents’ bliss point to being disinherited, so
that marginal increases in parental wealth are used to extract extra
attention from children.⁵ To illustrate these assertions, return once


4 Intuitively, this situation could arise as follows. Suppose that whenever the child
visits him the parent feels obliged to give the child expensive gifts. The child may take
advantage of this situation by visiting the parent more often than the parent wishes. If
possible, the parent would want to commit himself to a policy of giving smaller gifts.

5 Below we present empirical evidence that suggests that typical parents are not at 
their bliss point in terms of attention. 




again to figure 2. Redistributions between parent and child simply
shift the level of $\bar{c}_k$. Thus, if parents rely on the automatic incentive
scheme $c_k^*(a)$ (as in Becker’s model) and if corner constraints do not
come into play, such transfers will plainly have no real effects. However,
in our model, reducing $\bar{c}_k$ (perhaps through social security or
debt) changes the parent’s effective threat (point $E$) and therefore
alters the solution (point $C$). On the margin, bequests are used to
“purchase” a commodity from the child. Thus a transfer from child
to parent will have a pure income effect, and the fraction returned to 
the child will depend on the parent’s marginal propensity to consume 
attention out of lifetime wealth. Since most parents will devote only a 
very small fraction of lifetime resources to the purchase of attention 
from children, it is reasonable to expect that this income effect will be 
small. 

B. Strategic Bequests 

We observe both formal and informal means by which a benefactor 
might commit himself to particular rules regarding the distribution of 
gifts and bequests. At one extreme he could write and make public an 
explicit will. At the other extreme he could make informal promises 
and rely on his reputation for keeping such promises. Yet in each case 
it is possible for him to renege on his commitment: he might rewrite 
his will without publicizing this fact, or he might break his promise 
after the beneficiary has acted as desired. If it were, in fact, optimal 
for him to do this ex post, he would be unable to improve on auto¬ 
matic incentives since rational beneficiaries would anticipate his de¬ 
fection. 

Defection may, however, entail substantial costs: the benefactor 
may incur legal fees or injure his reputation. If benefits from defection
do not exceed these costs, then he can successfully precommit to
a strategic incentive scheme. 

When is this condition likely to be satisfied? Presumably it is quite 
unlikely that a benefactor can credibly name any arbitrary individual 
as a potential recipient of substantial transfers: the costs of defection 
may be low (he does not care what this individual thinks of him) and 
the benefits high (he greatly prefers to leave his money to someone 
else). However, if he is relatively indifferent about the distribution of 
transfers between certain individuals, he may have substantial scope 
for precommitment. 

Consider two hypothetical examples. First, suppose a parent has a 
single child, whom he loves far more than anyone else. He wishes to 
influence this child’s behavior by threatening disinheritance. For this 
threat to be credible, he must specify an alternative distribution of his 
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estate. Suppose he directs that his entire estate is to be left to some 
randomly selected third party unless the child complies (perhaps he 
simply promises the estate to the third party under the relevant con¬ 
tingency). The parent has an enormous incentive to renege on this 
directive, and the costs of doing so may be quite low (he breaks a 
promise to a stranger or incurs minimal legal costs).⁶ If so, the threat
is empty. 

On the other hand, suppose the parent has two children, both of 
whom he loves more than anyone else. Again he wishes to influence 
the behavior of one child by threatening disinheritance. Suppose he 
directs that his entire estate is to be left to the other child unless the 
first child complies. His incentive to renege on this directive is far less
than in the first case (he loves both children). His costs may also be
small since he may be more hesitant to break a promise to one of his
children.⁷ Thus the threat may well be credible.

To summarize: The costs and benefits of defection determine a set
of distribution rules to which the benefactor can successfully commit
himself (he lacks sufficient incentive to defect). Presumably, each of
these rules has the property that virtually all transfers are made to
individuals (or institutions) about whom the benefactor cares very
much. Regarding transfers to these individuals (whom we shall
hereafter call “credible beneficiaries”), the benefactor may have substantial
scope for precommitment. Indeed, it may be credible for him
to specify any distribution of transfers as long as everything is distributed
within this group.

The identification of the credible beneficiaries for any particular 
benefactor is an empirical issue. This set may be limited to children or
may include other relatives, friends, employees, or institutions.
Factors other than personal preference may also come into play: in some
states, courts protect children’s claims on their parents’ estates. For
the time being we will merely assume that this set is identifiable but
will avoid identifying it. For simplicity we will also assume that it is
credible to specify any distribution of transfers within this group (the
analysis changes very little if the set of credible distribution rules is
further constrained).

Formally, we modify the framework adopted in Section IA as follows.
First, we introduce additional potential beneficiaries. Second, we
assume that the benefactor can commit himself in advance both to
the total size of his bequest (perhaps by holding wealth in illiquid
forms, such as durables and housing) and to a rule governing the
division of this bequest.

6 In addition, it may be very difficult for an outsider to verify whether or not the
relevant contingency has materialized; the parent has an incentive to misrepresent
this.

7 In addition, children can monitor compliance more easily than an outsider.

In particular, we assume that the utility of the benefactor is given by
$U_p(c_p, a_1, \ldots, a_N, U_1, \ldots, U_N)$, where $a_n$ is some action taken
by the nth potential beneficiary and $U_n$ is the utility of this beneficiary.
Further, $U_n$ is given as some function of $a_n$ and $c_n$ (n’s consumption),
$U_n(c_n, a_n)$. For notational convenience we define $a = (a_1, \ldots, a_N)$.
The benefactor makes a transfer, $b_n$, to each beneficiary. He is
constrained to choose

$$b_n \ge 0, \qquad c_p + \sum_{n=1}^{N} b_n \le y_p.$$

The consumption of each beneficiary is then given by $c_n = \bar{c}_n + b_n$.

Formally, the game proceeds as follows. First, the benefactor locks
in his total bequest (thereby determining his own consumption as a
residual) and commits himself to a “bequest rule,” which governs the
division of his estate. This rule specifies, for each profile of actions $a$,
that a fraction $\beta_n$ of total bequest be given to each beneficiary n. We
represent this rule as a profile of N functions:

$$\beta(\cdot) = [\beta_1(a), \ldots, \beta_N(a)].$$

We require that the bequest rule satisfy only one condition: for all $a$,
$\sum_{n=1}^{N} \beta_n(a) = 1$. This restriction reflects two considerations. First, for
feasibility, the sum of shares cannot exceed unity. Second, the
benefactor must bequeath anything that he does not consume (in this
model, he lives for only one period), and, by assumption, he cannot
bequeath anything to anyone other than his credible beneficiaries.

After the benefactor has made these choices, potential beneficiaries
select levels of $a_n$. Finally, the estate is divided according to the
specifications of the bequest rule. Note that we could consider a
multiperiod version of this model, which has an additional implication
that parents hold wealth in both bequeathable and annuitized
forms. Such an extension is presented in Bernheim, Shleifer, and
Summers (1984) and produces the same conclusions as the model we
analyze here.

We motivate the solution to this game as follows. Consider the set

$$S_n = \{(a_n, b_n) \mid U_n(\bar{c}_n + b_n, a_n) \ge U_n(\bar{c}_n, \bar{a}_n)\},$$

where $\bar{a}_n$ is the level of $a_n$ that n would choose in the absence of
transfers. Graphically, $S_n$ corresponds to the set of points above (and
including) the indifference curve $I^*$ in figure 2. Since each beneficiary
receives a nonstrategic transfer, he can, at worst, be disinherited.

Thus, any equilibrium must involve beneficiary n’s consuming an allo- 

consider the following artificial problem for the benefactor, 
$$\max\; U_p(c_p, a, U_1, \ldots, U_N)$$

subject to

$$U_n = U_n[\bar{c}_n + \beta_n (y_p - c_p),\, a_n] \ge U_n(\bar{c}_n, \bar{a}_n).$$

The solution to this problem, denoted $(a^*, \beta^*, c_p^*)$, would be
appropriate if the benefactor could choose actions for potential
beneficiaries subject only to the constraint that each is willing to
participate. Now assume that there is a bequest rule, $\beta(\cdot)$, that, along
with $c_p^*$, induces $a^*$ as equilibrium choices for the beneficiaries and
satisfies $\beta(a^*) = \beta^*$. Since the resulting allocation achieves an optimum
ignoring incentive constraints, it must necessarily be the benefactor’s best
choice when these constraints are imposed. In equilibrium, play
would then evolve as follows: the benefactor would choose $c_p^*$ and $\beta(\cdot)$,
the beneficiaries would play $a^*$, and the estate would be divided
accordingly (n’s share would be $\beta_n^*$).

To characterize equilibrium for this game, we need only exhibit a
bequest rule, $\beta(\cdot)$, for which $a^*$ emerges as an equilibrium in the
beneficiaries’ subgame (when $c_p = c_p^*$), and $\beta(a^*) = \beta^*$. One such rule
operates as follows. We will refer to $a_n^*$ as n’s “benchmark” action.
Denote the set of beneficiaries who take at least their benchmark
actions by $B = \{n : a_n \ge a_n^*\}$. Let $\bar{B}$ denote the complement of $B$. If $B$ is
nonempty, the benefactor bequeaths nothing to members of $\bar{B}$. In
contrast, if n is a member of $B$, then n’s share will be



If $B$ is empty, then the benefactor bequeaths everything to the
beneficiary, m, whose action is closest to his benchmark level:
$a_m^* - a_m \le a_n^* - a_n$ for all n. This bequest rule is intuitively appealing: each
beneficiary normally receives a positive bequest but is disinherited if
he fails to meet a standard of “good” behavior. If all children are
“bad,” the “best” child receives the entire estate.
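The logic of this benchmark rule can be sketched in a few lines of Python. The sketch is illustrative only and is not taken from the paper: the function and argument names are hypothetical, and the way compliant beneficiaries share the estate (target shares rescaled so that they sum to one) is an assumption made here so that the shares always add up to unity, as the rule requires.

```python
from typing import List, Sequence


def bequest_shares(actions: Sequence[float],
                   benchmarks: Sequence[float],
                   target_shares: Sequence[float]) -> List[float]:
    """Estate shares under a benchmark bequest rule (illustrative sketch)."""
    n = len(actions)
    # Beneficiaries who take at least their benchmark action.
    compliant = [i for i in range(n) if actions[i] >= benchmarks[i]]
    shares = [0.0] * n
    if compliant:
        # ASSUMPTION (not from the paper): compliant beneficiaries receive
        # their target shares, rescaled so that all shares sum to one.
        # Target shares of compliant beneficiaries are assumed positive.
        total = sum(target_shares[i] for i in compliant)
        for i in compliant:
            shares[i] = target_shares[i] / total
    else:
        # Nobody met the benchmark: the beneficiary with the smallest
        # shortfall receives the entire estate.
        best = min(range(n), key=lambda i: benchmarks[i] - actions[i])
        shares[best] = 1.0
    return shares


print(bequest_shares([1.0, 1.0], [1.0, 1.0], [0.5, 0.5]))  # both comply
print(bequest_shares([0.0, 1.0], [1.0, 1.0], [0.5, 0.5]))  # first is disinherited
print(bequest_shares([0.2, 0.5], [1.0, 1.0], [0.5, 0.5]))  # nobody complies
```

In the second call the noncompliant beneficiary receives nothing, and in the third the “best” of the noncompliant beneficiaries receives everything, which is the threat structure described above.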

This rule defines a simultaneous move subgame where potential
beneficiaries choose actions $a_n$. It is easy to verify that there are $N + 1$
Nash equilibria for this subgame; one consists of every beneficiary
playing his benchmark level. In the remaining N equilibria, $N - 1$
beneficiaries choose their benchmark levels, while the last one (l)
chooses $\bar{a}_l$. However, for a variety of reasons one can safely ignore
these less desirable equilibria.8

Note that this construction depends integrally on the assumption
that $N \ge 2$. If $N = 1$, then $\beta_1 = 1$ for all admissible bequest rules. The
sole potential beneficiary knows that his behavior cannot affect his
inheritance; thus the benefactor is unable to influence his behavior.

Stepping outside of our formal structure, one might imagine other 
means by which a benefactor could motivate a sole beneficiary. For 
example, he might precommit to some rule that conditioned his own 
consumption on the beneficiary’s action. This is, however, far more 
difficult than "locking in” a fixed total bequest (by holding wealth in 
illiquid forms) as in our model. First, he cannot credibly commit to 
giving the beneficiary less than his automatic bequest, rf («); to renege, 
he need only break a promise to himself (this involves neither legal 
costs nor a damaged reputation). Thus, his scope for manipulation is 
severely limited. Second, he may have large incentives to renege on 
the promise of rewards above this automatic level since the level of 
his own consumption is at issue (he cannot lock in these incremental
rewards in advance, or they would have no influence on the
beneficiary’s choice). 

To summarize: When a benefactor can successfully threaten to
disinherit a potential beneficiary, he extracts the full surplus generated
through interaction with the beneficiary. However, success in this 
regard requires him to specify an alternative use for his resources that 
is believable; in particular, he must have more than one potential 
beneficiary to whom he can credibly plan to leave the bulk of his 
estate. 

The analysis presented in Sections IA and IB suggests several possible
ways of distinguishing between this theory and its competitors.
First, evidence that the behavior of children is influenced by
anticipated bequests would support the class of models in which bequests
function as a medium of exchange. Second, evidence suggesting that
parents successfully wield influence only when they have more than
one credible beneficiary would support the particular theory of
strategic influence outlined above. Third, evidence suggesting that
parents care directly about some actions taken by their children would
establish the motive for strategic as opposed to nonstrategic influence
(as in Becker’s rotten kid theorem).

8 We suggest two reasons: (1) For all $\epsilon > 0$, if the testator sets benchmark levels at
$a^* - \epsilon$ rather than $a^*$ and otherwise employs the same rule, there is a unique Nash
equilibrium consisting of all agents meeting their benchmarks. In other words, the
testator can get arbitrarily close to his optimum without running into the problem of
multiplicity. (2) The N undesirable equilibria are not trembling-hand perfect (see
Selten 1975). Consider the potential beneficiary who sets $a_n = 0$ while expecting his
competitors to offer their benchmark levels. He cannot be worse off by playing $a_n = a_n^*$.
In addition, if he thinks there is any chance, however small, that another beneficiary
will make a mistake (tremble), thereby missing his benchmark level, $a_n^*$ will in that event
yield strictly higher utility than $a_n = 0$.

II. Econometric Evidence 

In this section we provide empirical support for the hypothesis that
bequests are used, in part, to influence the behavior of potential
beneficiaries. Specifically, our examination of microeconomic panel
data reveals that contact between parents and children is much higher
in families where the elderly parent has a substantial amount of
bequeathable wealth to offer. We show that this correlation is robust
with respect to a variety of specifications and estimation techniques,
which are designed to rule out alternative explanations based on
potentially spurious factors. In addition, we explore some implications
of the particular model developed in Section I that differentiate it
from closely related alternatives and use these implications to test the
model. The results are extremely favorable to our formulation of
strategic bequests.

Bequests can serve as a means of payment for services only if the
presence of bequeathable wealth can influence the behavior of
potential beneficiaries and if testators exercise this influence. We adopt a
slight abuse of terminology, referring to these two distinct aspects of
exchange as the “supply” and “demand” sides. Primarily because of
the nature of available data, our basic strategy is to estimate the effect
of bequeathable wealth on the amount of services beneficiaries
provide to testators, that is, the supply side. Although we do not estimate
the demand side explicitly, we provide indirect statistical evidence for the
claim that testators exploit the relationship between services and
bequests.

The econometric investigation detailed below requires rather
specific data concerning assets and family interactions for a sample of
elderly individuals. The Longitudinal Retirement History Survey
(LRHS), conducted by the Office of Research and Statistics of the
Social Security Administration, collected surprisingly extensive
information on these characteristics. Data from the 1969, 1971, 1973, and
1975 waves of the LRHS were available at the time of this writing;
unfortunately, insufficient data on assets were collected in 1973, so we
were forced to drop this year. Over 11,000 individuals aged 58-63
were included in the first wave. Many of these were lost to attrition;
on top of this we restricted our sample to married couples who had at
least one child but no children living at home and for whom sufficient
data on nonbequeathable assets were available.9 Our final sample
consisted of 1,166 observations, 855 of which had two or more living
children and 311 of which had only one living child.

Measures of contact between parents and children were constructed
as follows. For each observation the LRHS contains information
on total number of children ($C_i$), number of children who visit or
telephone their parents weekly ($VW_i$), and number of children who
visit or telephone their parents monthly ($VM_i$).10 Our measure of
attention per child was constructed from these variables as follows:

$$V_i = \frac{4 \cdot VW_i + VM_i}{4 \cdot C_i},$$

where $V_i$ indicates contact per child, normalized so that maximum
contact equals unity. We have adopted the approximation that children
who visit weekly give their parents four times as much attention
as those who visit monthly. It is interesting to note in passing that the
mean of $V_i$ was 0.54 in 1969 and rose to 0.63 in 1975; evidently the
average level of contact is quite high and rises with age.
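As a small worked illustration of this index, the following Python sketch computes contact per child from the three survey counts. The function and argument names are hypothetical, and the sketch assumes that the weekly and monthly categories are mutually exclusive, so that the index cannot exceed unity.

```python
def contact_per_child(children: int, weekly: int, monthly: int) -> float:
    """Contact index V = (4 * VW + VM) / (4 * C).

    Assumes (an assumption of this sketch) that a child is counted in
    either the weekly or the monthly category, never both.
    """
    if children <= 0:
        raise ValueError("at least one child is required")
    if weekly + monthly > children:
        raise ValueError("more visiting children than children")
    return (4 * weekly + monthly) / (4 * children)


# Three children, all in weekly contact: maximum contact.
print(contact_per_child(children=3, weekly=3, monthly=0))
# Two children, one weekly and one monthly.
print(contact_per_child(children=2, weekly=1, monthly=1))
# Four children, all in monthly contact only.
print(contact_per_child(children=4, weekly=0, monthly=4))
```

A family in which every child is in weekly contact scores one, while a family in which every child is in monthly contact scores one-quarter, reflecting the four-to-one weighting adopted in the text.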

Other variables were constructed as follows. Bequeathable wealth
per child ($b_i$) includes financial wealth (stocks, bonds, mutual funds,
bank accounts, checking accounts), residential and other property,
the face value of life insurance,11 privately purchased annuities,12 and
debt. Nonbequeathable annuity wealth per child ($aw_i$) includes social
security and pension wealth. These were obtained by converting data
on income from those sources to capitalized values applying a
discount rate of 1.03 and actuarial survival probabilities. Matching
administrative records contained data on income earned from 1951 to
1975 in employment covered by social security up to the taxable
maximum. This information was extrapolated to yearly earnings using the
method described in Fox (1976). The resulting income stream was
then accumulated at a 3 percent rate of return to produce a measure
of lifetime earnings for both husband and wife. Other variables used
in the following analysis included age of respondent and dummy
variables indicating whether the respondent’s health is better (BH) or
worse (WH) than that of other members of his cohort, as well as
whether the respondent is retired (RET).

9 Specifically, we included those who began to receive pensions and social security at
some point during the sample. Note that our theory predicts that the use of bequests to
obtain attention should be more effective when there is only one parent. By considering
couples we presumably stack the odds against finding evidence of exchange.

10 For some years the survey also asked for the number of children who visit or
telephone daily; in other years, this was simply incorporated into the “at least weekly”
category. To be consistent over years, we added daily contact to weekly contact in years
for which the former was available.

11 It is appropriate to include the face value of life insurance since children wish to
be named as beneficiaries. Unfortunately, data on life insurance are quite poor: in
particular, it is impossible to determine how much individuals have borrowed against
their policies. Omitting insurance from our definition of bequeathable wealth has an
insignificant impact on the estimates presented in this section.

12 Most privately purchased annuities fail to match the economic definition since they
have bequeathable components.

One practical difficulty with these data is that information on the 
behavior of potential beneficiaries is limited to children. For any given 
individual the set of credible beneficiaries may or may not be larger. 
Since our theory suggests that successful exchange takes place only 
when this set contains at least two candidates, we cannot be certain 
that single-child families will behave in the manner predicted here. 
Consequently, we initially restrict attention to families with two or 
more children. Analysis and discussion of behavior in single-child 
families are deferred to the end of this section. 

Another general issue that arises with regard to the use of these 
data concerns the treatment of separate sample years. Except where 
noted, results presented in this paper are based on simple pooling of
the sample years. No correction is made for potential correlation
between distinct observations on the same household. Such correlation
would not, by itself, cause our estimates to be inconsistent; however,
it would imply that standard errors are calculated incorrectly. In
order to determine the probable magnitude of the resulting error, we
reestimated a number of our specifications employing the appropriate
generalized least squares (GLS) correction. Although small
changes in some point estimates were noted, no qualitative conclusions
were altered. More important, estimated standard errors on
critical coefficients (such as b) differed only slightly from those
obtained with simple pooling.

We begin our analysis by specifying the supply of attention from
children as a function of potential bequest per child:

$$V_i = \beta_0 + \beta_1 b_i + \epsilon_i, \tag{2}$$

where $V_i$ and $b_i$ are defined above and $\epsilon_i$ is a random error term.
Within the context of our theoretical model, one can think of
equation (2) as a linear approximation to the implicit function defined by
(1), aggregated over beneficiaries.

Our first step was to estimate equation (2) using ordinary least
squares (OLS).13 Results are presented as equation (i) in table 1. While
the sign of the coefficient on $b_i$ is consistent with our theory, one
cannot reject the hypothesis that bequeathable wealth holdings have
no effect on attention per child.

13 Throughout we have ignored potential problems arising from truncation of our
dependent variable. There is little reason to believe that this biases our results in any
particular direction.

TABLE 1

Sample: Multiple-Child Families, Pooled Panel (Dependent Variable V)

                                      Procedure
                  OLS         2SLS        2SLS        2SLS        2SLS
               (Eq. [i])   (Eq. [ii])  (Eq. [iii])  (Eq. [iv])  (Eq. [v])

Constant         .560        .531        .088        .225        .230
                (.008)      (.013)      (.201)      (.215)      (.350)
b/10^6           .333        2.30        2.57        4.58        8.51
                (.308)      (.686)      (.715)      (1.18)      (18.4)
aw/10^6                                             -1.78       -1.85
                                                    (.820)      (.867)
AGE/100                                  .722        .513        .529
                                        (.311)      (.332)      (.549)
BH/100                                  -2.55       -2.96       -2.41
                                        (1.79)      (1.84)      (4.21)
WH/100                                  -1.37       -.984       -24.2
                                        (2.43)      (2.49)      (8.22)
RET/100                                 -2.22       -3.26       -3.67
                                        (1.89)      (1.99)      (2.03)
b · AGE/10^7                                                    -.756
                                                                (2.92)
b · BH/10^7                                                     -.73...
b · WH/10^7

There are, however, a variety of reasons for believing that OLS
estimates of this relationship may be inconsistent. One reason follows
directly from the structure of our model: explicit consideration of the
demand side suggests that $b_i$ will be determined endogenously. The
parent’s optimal choice of $b_i$ depends in part on the preferences of his
children, and $\epsilon_i$ is an important component of these preferences.
Thus, as long as the parent has more information about the
preferences of his children than does the econometrician, $b_i$ and $\epsilon_i$ will be
correlated. The direction of the resulting bias is, however,
ambiguous.

Correlation between $b_i$ and $\epsilon_i$ is likely to be present for other reasons
as well. Stepping outside the formal model of the last section, one
particularly plausible story is that some parents get along well with
their children while others do not. Those that do may hold more
bequeathable wealth simply because they like their children, while the
children in turn may be attentive simply because they like their
parents.

Our solution to this set of problems is to instrument for $b_i$ in
equation (2) using the parents’ lifetime earnings $y_i$. We justify this choice of
instrument as follows. It is clear that lifetime earnings are positively
correlated with holdings of bequeathable wealth. We must establish
that, in addition, this instrument is uncorrelated with $\epsilon_i$. For our first
story, $y_i$ may be correlated with $\epsilon_i$ if parents work harder when young,
so that they have more wealth with which to influence their children
when old. For our second story, this correlation may be nonzero if the
elderly parents whose children particularly like them have been
particularly hardworking (or lazy).14 Although one could, in both cases,
plausibly argue that the correlation is nonzero, it is difficult to believe
that it is very large.

Two-stage least squares (2SLS) estimates of equation (2) are
presented as equation (ii) of table 1. Notice that the coefficient on $b_i$ is
approximately eight times as large as the corresponding OLS estimate
and that the hypothesis of no effect on attention can be rejected at
extremely high levels of confidence. This regression confirms our
prediction that, in multiple-child families, bequeathable wealth will be
strongly correlated with attention.
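The contrast between OLS and instrumental-variables estimates can be illustrated with a small simulation. The sketch below uses entirely synthetic data and makes no attempt to reproduce the LRHS estimates: all names and parameter values are hypothetical. It relies on the standard fact that, with a single endogenous regressor and a single instrument, the 2SLS slope reduces to the ratio of two covariances. The unobserved factor is constructed, purely as an assumption of the sketch, to raise wealth and lower contact, so that OLS is biased downward.

```python
import random


def _mean(xs):
    return sum(xs) / len(xs)


def _cov(xs, ys):
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def ols_slope(x, y):
    """OLS slope of y on x (with a constant)."""
    return _cov(x, y) / _cov(x, x)


def iv_slope(z, x, y):
    """Just-identified 2SLS slope of y on x, instrumenting x with z."""
    return _cov(z, y) / _cov(z, x)


def simulate(n=20000, seed=0):
    """Synthetic data: true slope is 2, regressor is endogenous."""
    rng = random.Random(seed)
    earnings, wealth, contact = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)            # instrument (stands in for earnings)
        u = rng.gauss(0, 1)            # unobserved factor
        b = z + u + rng.gauss(0, 1)    # wealth depends on both
        v = 0.5 + 2.0 * b - 3.0 * u + rng.gauss(0, 1)
        earnings.append(z)
        wealth.append(b)
        contact.append(v)
    return earnings, wealth, contact


z, b, v = simulate()
ols = ols_slope(b, v)
iv = iv_slope(z, b, v)
print(f"OLS slope: {ols:.2f}   IV slope: {iv:.2f}   (true slope: 2.00)")
```

In this construction the OLS slope settles near one while the instrumental-variables slope settles near the true value of two; the direction of the OLS bias is a feature of the simulated design, not a general property.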

The apparently striking difference between OLS and 2SLS
estimates can be tested formally. A Hausman (1978) test reveals that
exogeneity of $b_i$ can be rejected at a high level of confidence. This
conclusion is consistent with our model (in which $b_i$ and $a_i$ are
simultaneously determined) and constitutes limited evidence in favor of an
operative demand side. One should, of course, bear in mind that this
rejection of exogeneity is also consistent with other alternatives.
Nevertheless, it is worth noting that the particular alternative outlined
above (correlation between filial and paternal altruism) implies that
OLS estimates of the coefficient on $b_i$ should be biased upward. In
fact, we observe the opposite.
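For a single coefficient, a Hausman-type statistic can be approximated from reported estimates alone. The following back-of-the-envelope Python sketch applies the usual scalar form, the squared difference between the two estimates divided by the difference in their variances, to the coefficients on bequeathable wealth in equations (i) and (ii) of table 1. The paper does not report the statistic or say which form of the test was used, so the calculation is an illustration under the assumption that OLS is efficient under the null and that the rounded standard errors are adequate.

```python
import math

# Coefficient on b and its standard error, table 1.
ols_coef, ols_se = 0.333, 0.308     # equation (i), OLS
tsls_coef, tsls_se = 2.30, 0.686    # equation (ii), 2SLS

# Scalar Hausman-type statistic (an approximation, see the text above).
diff = tsls_coef - ols_coef
var_diff = tsls_se ** 2 - ols_se ** 2
hausman = diff ** 2 / var_diff

# Upper-tail probability of a chi-square variate with one degree of
# freedom: P(chi2_1 > h) = erfc(sqrt(h / 2)).
p_value = math.erfc(math.sqrt(hausman / 2.0))

print(f"Hausman-type statistic: {hausman:.2f}")
print(f"approximate p-value:    {p_value:.4f}")
```

Under these assumptions the statistic is roughly 10.3, which exceeds the conventional 1 percent critical value of 6.63 for one degree of freedom and is in line with the rejection of exogeneity reported in the text.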

While our theoretical model offers one explanation for the set of
results described above, the observed correlation between attention
and bequeathable wealth could also be attributed to a number of
spurious factors. We now turn to the task of ruling out these
alternative explanations.

One might object that our basic specification omits a number of
important variables with which both attention and bequeathable
wealth are highly correlated. For example, healthy parents may be
more pleasant to visit (or, conversely, less needy of attention) as well
as more successful in the marketplace. Older parents belong to a
poorer cohort and in general require more care. Retired parents may
have a greater desire for contact with children. We correct for these
difficulties by adding a vector of parental characteristics, $Z_i$, to our
basic specification:

$$V_i = \beta_0 + \beta_1 b_i + Z_i \gamma + \epsilon_i. \tag{3}$$

In particular, $Z_i$ includes age, health dummies ($BH_i$, $WH_i$), and a
retirement dummy ($RET_i$). Results are presented as equation (iii) in
table 1. The inclusion of these additional variables appears to have
very little impact on either the magnitude or statistical significance of
the coefficient on $b_i$.

14 The direction of this correlation is not clear. If a parent likes his child, he may
work harder to provide more physical goods or work less to spend more time with his
child.

Another apparently compelling objection is that wealth may affect 
attention through a variety of spurious channels. For example,
parents with higher wealth may simply pay for traveling expenses,
telephone calls, and so forth in order to have more contact with their
children. Wealth effects may also be less direct. In particular, there is 
presumably a positive correlation between the incomes of parents and 
those of their children. A wealthy child may be more difficult to 
influence or more desirable to visit. Wealthy children may be more 
capable of defraying the costs of travel and telephones but may also, 
on average, live farther from their parents. Thus the direction of the 
potential bias is not obvious. 

Note, however, that these alternative explanations do not
distinguish between bequeathable and nonbequeathable wealth (social
security and pension annuity), as does our theory. A parent’s ability to
defray the costs of contact is determined both by his ordinary wealth
and by his claims on annuities. Similarly, while it is true that the
wealth of children is correlated with parental resources, it is not likely
to be highly correlated with the division of parental resources
between bequeathable and nonbequeathable forms. Thus, in order to
determine the magnitude of spurious wealth effects, we add annuity
wealth ($aw_i$) to our basic specification:

$$V_i = \beta_0 + \beta_1 b_i + \beta_2 aw_i + Z_i \gamma + \epsilon_i. \tag{4}$$

The effect of holding another dollar of wealth in bequeathable form
is then given by the difference between the coefficients on $b_i$ and $aw_i$
$(\beta_1 - \beta_2)$.15

Estimates of specification (4) are presented as equation (iv) in table
1. Note that the coefficient on $aw_i$ (the spurious wealth effect) is
negative and statistically significant, while the coefficient on $b_i$ is positive
and highly significant. The effect of holding wealth in bequeathable
rather than annuitized form, given by the difference between these
coefficients, is estimated to be 6.36, with a standard error of 1.69.
Thus, correcting for spurious wealth effects only strengthens our
original conclusion.

15 Equation (4) is equivalent to $V_i = \beta_0 + (\beta_1 - \beta_2) b_i + \beta_2 W_i + Z_i \gamma + \epsilon_i$, where $W_i$ is
the total wealth of the ith individual, $\beta_2$ captures spurious wealth effects, and $\beta_1 - \beta_2$ is
the independent effect of holding wealth in a bequeathable form.
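The arithmetic behind the reported difference can be checked directly from table 1. The short Python sketch below recomputes the difference between the coefficients on bequeathable and annuity wealth in equation (iv) and the implied ratio of the estimate to its standard error. The final step, which backs out the covariance between the two coefficient estimates from the reported standard errors, is a rough inference made here from rounded figures and is not a number reported in the paper.

```python
# Equation (iv), table 1: coefficients and standard errors.
coef_b, se_b = 4.58, 1.18        # bequeathable wealth
coef_aw, se_aw = -1.78, 0.820    # annuity wealth
se_diff_reported = 1.69          # standard error of the difference (text)

# Effect of holding wealth in bequeathable rather than annuitized form.
diff = coef_b - coef_aw
ratio = diff / se_diff_reported

# Var(b1 - b2) = Var(b1) + Var(b2) - 2 Cov(b1, b2), so the covariance
# implied by the reported figures (rough, because of rounding) is:
implied_cov = (se_b ** 2 + se_aw ** 2 - se_diff_reported ** 2) / 2.0

print(f"difference:          {diff:.2f}")
print(f"estimate / s.e.:     {ratio:.2f}")
print(f"implied covariance:  {implied_cov:.3f}")
```

The difference of 6.36 matches the figure in the text, and the negative implied covariance is what makes the standard error of the difference larger than a calculation that treats the two estimates as independent would suggest.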

Another possible solution to the problem of spurious wealth effects
is to restrict attention to a subsample for which these effects are likely
to be unimportant. If the source of contamination concerns ability to
pay, then such effects may be minimized by considering a subsample
for which financial costs of contact are negligible. Presumably,
geographic proximity eliminates much of these costs. Fortunately, the
LRHS contains relevant information. Accordingly, we reestimated
equation (4) for two subgroups: parents whose children all live within
the same city or neighborhood and parents whose children all live
within 150 miles. The parameter estimates (omitted)16 were quite
close to those obtained for the entire sample. In fact, the effect of
bequeathable wealth on attention appeared to be largest for parents
living in closest proximity to their children.

A related objection concerns the inclusion of housing wealth in our
measure of $b_i$. It has been suggested to us that a positive coefficient on
$b_i$ may simply reflect the fact that children prefer to visit parents who
live in nice houses. To accommodate this objection, we reestimated
equation (4), substituting bequeathable nonhousing wealth ($bnh_i$) for
$b_i$. Despite the fact that most elderly individuals hold a large fraction
of their portfolios in residential housing, the estimates (omitted) were
very close to those presented in table 1; in fact, the estimated
bequeathable wealth effect was slightly larger. On the basis of this
evidence, we are inclined to reject the hypothesis that our results are
simply an artifact of some special feature of housing wealth.

As a final check on the robustness of our results, we reestimated
equation (4) separately for each of our sample years. The coefficients
of interest (those on $b_i$ and $aw_i$) were extremely stable over the sample
period (estimates are omitted).

So far, our empirical analysis has been solely concerned with
establishing a link between attention and bequeathable wealth and with
ruling out alternative explanations based on potentially spurious
factors. We now explore some other implications of the particular model
developed in Section I that differentiate it from closely related
alternatives, and we use these implications to test the model.

16 All the omitted estimates are contained in our working paper (Bernheim et al.
1984).

First, a number of the variables included in $Z_i$ should affect the
“price” at which attention can be purchased, as well as the absolute
amount of attention supplied by children. Consider, for example, the
variable $WH_i$ (worse health). Although sick parents may receive more
attention simply because of filial devotion, in a more cynical view,
illness increases the probability of death, thereby making a potential
bequest of fixed magnitude more valuable to the child. To
differentiate between these effects, we reestimated equation (4), adding
interactions between $b_i$ and $WH_i$, $BH_i$, and $AGE_i$.17 The results,
presented as equation (v) of table 1, are quite striking. Only three
coefficients are statistically significant: those on $aw_i$, $WH_i$, and $WH_i \cdot b_i$.
The coefficient on $aw_i$ changes very little from our original estimate.
The coefficient on $WH_i$ is negative, indicating that, aside from
exchange-motivated concerns, sick parents receive less attention. In
contrast, the coefficient on $WH_i \cdot b_i$ is large and positive. This strongly
suggests that, for multiple-child families, rich parents (where “rich” is
strictly defined in terms of bequeathable assets rather than total
resources) who are in poor health receive much more attention than
their indigent counterparts. Once again, the data suggest significant
financial motivation.

A second strong implication of our particular theory is that
exchange-motivated holding of bequeathable wealth can influence the
behavior of potential beneficiaries only if there are at least two
credible candidates. Unfortunately, as mentioned above, there is no way to
determine the number of such candidates for any particular
respondent in the LRHS. However, logically speaking, our theory admits the
possibility that children are, in some meaningful sense, the only
credible beneficiaries for the bulk of a parent’s estate. This hypothesis can
be tested empirically by investigating behavior in single-child families
and comparing it with our multiple-child results. We must emphasize
that this hypothesis is not a consequence of our theory; thus failure to
differentiate between behavior in single- and multiple-child families
would not recommend rejection of our theory. However, the absence
of a positive correlation between attention and bequeathable wealth in
single-child families would strongly support our theory, as well as the
supplemental hypothesis that parents cannot credibly threaten to
disinherit all of their children.

These considerations motivated us to reestimate each specification
above using data on single-child families. Our results are presented in
table 2. Note that in equations (i)-(iv) the pattern of signs on the
coefficients on $b_i$ and $aw_i$ is precisely the opposite of that obtained for
multiple-child families. In addition, the standard errors of
coefficients on key parameters are relatively small. It is worth noting that
the coefficient on $b_i$ in these regressions is quite close to the
magnitude of the spurious wealth effect estimated for multiple-child
families.18 This is what one would expect, since $b_i$ is no longer
“contaminated” by strategic considerations. The only troubling aspect of
these estimates is that there appears to be a statistically significant
difference between the coefficients on $b_i$ and $aw_i$; presumably, $aw_i$
should carry only the spurious wealth effect as well. Strictly speaking,
this is inconsistent with our model. Note finally that in equation (v),
worse health continues to have a negative impact on attention
(although the magnitude is not statistically significant); however, there is
no evidence that this can be compensated for by high bequeathable
wealth holdings, as in multiple-child families. This evidence strongly
supports the hypothesis that strategic bequests take place only in
families with at least two children; thus children are usually the only
credible beneficiaries. It is difficult to reconcile this conclusion with
any known model of bequests other than that presented in Section I.

17 For the 2SLS regressions we included interactions between lifetime income and
$WH_i$, $BH_i$, and $AGE_i$ in the instrument list.

18 That is, it equals the coefficient on annuity wealth in the equations presented in
table 1.

One possible explanation for our findings is that the results on the 
multiple-child family are an artifact caused by estimating across 
families of different sizes. As a check against this possibility, we rees¬ 
timated equation (4) separately for two- and three-child families. Re¬ 
sults are presented in table 3. For both groups, key parameter esti¬ 
mates are very close to those obtained for the original sample. 19 

A further remark on the difference between single- and multiple- 
child families is in order. Just as it is difficult to see how this difference 
could be reconciled with any other known theory of bequests, it is also 
difficult to see why any explanation of our multiple-child results 
based on potentially spurious factors would not apply equally well to 
single-child families. Thus, our results refute any alternative explana¬ 
tion that fails to account for the single/multiple child distinction. We 
believe that this makes the empirical case for our theory compelling. 

Taken as a whole, the preceding estimates are extremely favorable 
to our model. It is therefore important to emphasize that our results 
are extremely robust and that these estimates are representative of 
other regressions we ran but did not include in this paper. Aside from 
some problems with selecting a proper subsample (e.g., one prelimi¬ 
nary sample inadvertently included observations of which children 
lived at home, making interpretation of visits and telephone calls 
difficult), our procedures produced favorable results on the first try, 
and subsequent modifications did not alter any substantive conclu¬ 
sions. Full disclosure requires that we report three apparent “fail¬ 
ures.” First, OLS estimates of all but the simplest specification (eq. [i], 
table 1) yielded negative coefficients on b_i. This is not surprising in the
light of our arguments concerning the endogeneity of b_i; in fact,
we submit that the discrepancy between OLS and 2SLS estimates 
strengthens the case for an operative demand side. Second, attempts 
to estimate a fixed-effects version of the model produced nonsensical 
coefficients with large standard errors. However, since no sensible 
instrument is available for fixed-effects estimation (there is only one 
observation on lifetime earnings for each respondent), we were not 
troubled by this finding. Finally, estimates based on an alternative 
measure of attention (letters received from children) were much less 
striking. Although the pattern of coefficients was consistent with our 
theory (the coefficient on b_i was greater than the coefficient on aw_i for
multiple-child families, and vice versa for single-child families), alternative
hypotheses could not be rejected with any reasonable level of
confidence. On reflection we decided that the letters variable was not
a very satisfactory proxy for attention since parents who were frequently
visited presumably received few letters.

19 This should not be surprising. Our instrumental variable is not deflated by the
number of children in the family.

TABLE 3

III. Other Evidence 

The preceding econometric analysis of the LRHS data favors the view
that strategic exchange plays an important role in bequest behavior. 
By and large the predictions of our model are confirmed. At least 
some of these predictions are not implications of alternative models of 
bequest behavior. Beyond this evidence there are a number of other 
aspects of individual behavior that are more easily reconciled with our
model of strategic bequests than with alternative formulations.

There are at least three alternative formulations to the present
model of bequest behavior that have been widely studied. These are
the “accidental bequests,” “bequests for their own sake,” and “altruistic
bequests” models. The first, recently urged by Davies (1981),
suggests that consumers do not have bequest motives and that bequests
arise only as a consequence of uncertainty about the date of
death in conjunction with annuity market imperfections. A second
model, used by Blinder (1974) and many others, assumes that consumers'
lifetime utility depends in part on the size of their bequest. In
this view bequests are a form of terminal consumption. A final possibility
is the “altruistic” view of bequests put forth by Barro (1974) and
Becker (1974, 1981). In this view parents maximize a utility function
in which the utility of their children also enters, but they engage in no
strategic behavior.

Each of these formulations is inconsistent with the empirical obser¬ 
vation that consumers are reluctant to participate in annuity-type 
arrangements even on quite favorable terms. Moreover, the second 
and third formulations cannot account for the apparent insignifi¬ 
cance of gifts. We first review the available evidence, then indicate 
why it contradicts the three standard models of bequest behavior, and 
finally describe why such behavior is consistent with our model. 

Privately purchased annuities are a rarity in the American econ¬ 
omy. The LRHS revealed that such annuities rarely represented 
more than a very small fraction of wealth and, in most cases, were not 
purchased at all. Of course, this may well be so because adverse selec¬ 
tion complicates the working of this market. 20 Perhaps more persua¬ 
sive evidence comes from the lack of market response to “reverse 
annuity mortgages.” These instruments allow individuals to annuitize 
their home equity. Even where they are offered on relatively favor¬ 
able terms, they do not appear to be well received. 21 A similar conclu¬ 
sion is suggested by the lack of response to a California state program 
that allowed property owners to defer property taxes until after their 
death on a subsidized basis (see Urban Systems Research and En¬ 
gineering 1983). 

Perhaps the strongest evidence of consumer resistance to annuities
comes from an examination of the choices made by retirees under the
TIAA-CREF program. This group is mainly composed of educators
who are presumably better informed than most pension recipients.
Retirees are offered several options, including full annuities and “n-year
certain” plans. 22 A 10-year certain, for example, guarantees that
a retiree and his heirs will receive at least 10 years' worth of benefits,
even if the retiree dies sooner. 23 A 1973 study reported that over 70
percent of beneficiaries chose plans other than those providing full
annuity protection. This suggests a desire to make allowances for
bequests.

20 Though one would expect the adverse selection to be much more serious in the
relatively well-functioning market for life insurance. Warshawsky (1983) presents evidence
that loads on annuities are comparable to loads on life insurance.

21 For a survey of the evidence on this topic see Urban Systems Research and
Engineering (1983).

22 Annuity amounts are set so that the plans are, in principle, equivalent on an
actuarial basis (see TIAA-CREF 1973).

23 In each case, provision is made for surviving spouses.
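
The trade-off between a full annuity and an n-year certain plan can be made concrete with a small calculation. The sketch below is purely illustrative and is not taken from TIAA-CREF: the constant 5 percent annual mortality hazard, the 3 percent interest rate, the 60-year horizon, and the $100,000 accumulation are all assumptions, and provisions for surviving spouses are ignored. It sets benefits so that the two plans are actuarially equivalent and then measures how much of the certain plan's expected value is paid after the retiree's death, which is the sense in which choosing such a plan makes an allowance for bequests.

```python
def survival(t, hazard):
    """Chance a retiree is still alive t years after retirement.

    Assumes a constant annual mortality hazard (an illustrative simplification).
    """
    return (1.0 - hazard) ** t


def value_per_dollar(certain_years, rate=0.03, hazard=0.05, horizon=60):
    """Expected present value of $1 a year, paid for sure during the first
    certain_years and only while the retiree is alive afterwards."""
    total = 0.0
    for t in range(1, horizon + 1):
        prob_paid = 1.0 if t <= certain_years else survival(t, hazard)
        total += prob_paid / (1.0 + rate) ** t
    return total


def annual_benefit(accumulation, certain_years, **assumptions):
    """Yearly benefit that makes the plan actuarially equivalent to the accumulation."""
    return accumulation / value_per_dollar(certain_years, **assumptions)


def heirs_share(certain_years, rate=0.03, hazard=0.05, horizon=60):
    """Fraction of the plan's expected value that is paid after the retiree's death."""
    to_heirs = sum(
        (1.0 - survival(t, hazard)) / (1.0 + rate) ** t
        for t in range(1, certain_years + 1)
    )
    return to_heirs / value_per_dollar(certain_years, rate, hazard, horizon)


# Hypothetical accumulation of $100,000 at retirement.
full = annual_benefit(100_000, certain_years=0)
ten_certain = annual_benefit(100_000, certain_years=10)

print(f"full annuity benefit:    {full:,.0f} per year")
print(f"10-year certain benefit: {ten_certain:,.0f} per year")
print(f"share of 10-year certain value going to heirs: {heirs_share(10):.1%}")
```

Under these assumptions the 10-year certain plan pays a lower yearly benefit than the full annuity, because part of its value is reserved for payments made after death. A retiree with no bequest motive would have no reason to accept that reduction.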

This evidence suggests that there is no strong latent demand on 
the part of aged Americans for annuity protection, and it is clearly in¬ 
consistent with the accidental bequest model. In this view individuals 
should purchase annuity protection even if it is very unfair actuarially 
since bequests are not valued at all. In particular, the choice of years 
certain annuity protection directly contradicts the accidental bequests 
model. 

Less obviously, the reluctance of consumers to take advantage of
actuarially fair or subsidized annuities is inconsistent with the bequests
for their own sake and altruistic models of bequests. It is well
known (see, e.g., Sheshinski and Weiss 1981; Bernheim 1984a) that
under such formulations consumers who have access to actuarially
fair annuity markets will perfectly insure, financing consumption entirely
out of annuity income. An underannuitized individual will
finance consumption partly out of bequeathable wealth, while an
overannuitized individual will save some fraction of his annuity income,
thereby building an estate. Thus if an individual consumes
some portion of either the principal or income from his bequeathable
wealth, we infer that he is underannuitized and should take advantage
of actuarially fair opportunities to purchase annuities.

There are two reasons to believe that individuals hold bequeathable
wealth in part to finance their own personal consumption. First, despite
the earlier findings of Brittain (1978) and Mirer (1979), more
recent studies by Diamond and Hausman (1982), King and Dicks-Mireaux
(1982), and Bernheim (1984b) suggest that retirees do dissave
from bequeathable wealth. Second, if bequeathable wealth is
held only for the purpose of making intergenerational transfers, then
these transfers would be made as gifts rather than as bequests at
death. Early transfer confers two advantages: it allows beneficiaries to
annuitize the optimal fraction of transferred resources immediately,
and it may ease liquidity constraints encountered by beneficiaries
early in the life cycle. 24

To summarize: Behavioral evidence suggests that individuals hold
bequeathable wealth in part to finance personal consumption. Under
either the bequests for their own sake or altruistic models, this implies
that such individuals are underannuitized and should take advantage
of actuarially fair opportunities to insure. Yet this prediction is
counterfactual.


24 Note also that the failure of parents to transfer their homes to their children is
inconsistent with the Kotlikoff-Spivak (1981) view that families serve to provide private
annuity insurance.

The reluctance of very wealthy individuals to convert bequests into
intra vivos gifts poses a further puzzle for these alternative theories. 
Despite the existence of significant tax advantages to transferring 
resources during lifetimes, many wealthy individuals who can antici¬ 
pate leaving large bequests with virtual certainty do not make 
significant intra vivos gifts. This observation has disturbed some pro¬ 
ponents of dynastic altruism who recognize that an important impli¬ 
cation of this model is that families will conduct their affairs to mini¬ 
mize total tax liability. While some (notably Adams 1978) have 
defended dynastic altruism by arguing that, contrary to Shoup 
(1966), Cooper (1979), and Menchik (1980), tax-minimizing transfers 
are in fact observed, we find this claim implausible. 25

The strategic bequest model described in Section I does not share
these counterfactual implications concerning the acceptance of an¬ 
nuities and the use of gifts. By making all intentional transfers at 
once, the parent attenuates his ability to influence his children in 
subsequent periods (King Lear’s well-known blunder). Furthermore, 
it is quite likely that it is easier to influence children by promising 
bequests as opposed to gifts. Few families are so mercenary as to 
countenance explicit quid pro quo contracts; thus the lure of gifts tends 
to be more speculative than a claim on a known estate, and vague 
promises of contemporaneous rewards are subject to equivocation by 
parents who would prefer to retain resources ex post. 

A common finding in empirical analyses of bequests (Sussman et al. 
1970; Brittain 1978; Menchik 1980) 26 is that, in most cases, parents
give equal amounts to each of their offspring. In part, this conclusion 
may arise from focusing primarily on cash rather than on the more 
difficult-to-value tangible bequests. The model here makes no predic¬ 
tion that bequests should be equal across children, except by coinci¬ 
dence or if beneficiaries are identical. Indeed this observation alone 
cannot refute the hypothesis that bequests are used to influence the 
behavior of beneficiaries since, in equilibrium, threats are never car¬ 
ried out. At best, it establishes that, for reasons not captured in our 
model, parents do not manipulate their children “optimally." Equal 
bequests pose an equal or greater problem for the altruistic model, 
which issues the clear prediction that bequests should be used to 


25 Adams (1978) overstates the burden of the capital gains tax by neglecting the fact
that the beneficiary can defer realizing any assets with capital gains and can use a
variety of other devices to shelter them. Nor does his analysis explain the failure of
most families to set up nonreverting trusts that allow assets and, in some cases, capital
income as well to escape tax almost entirely. Last, Adams's analysis cannot explain why 
assets without capital gains, or even with capital losses, also appear to be transferred as 
gifts infrequently. 

26 Disputed, however, by Tomes (1981).




equate as closely as possible the utilities of various offspring. 27 The 
other two models of bequests do not have any clear implications for 
this issue. 

So far we have been content to infer motives indirectly from behav¬ 
ioral observations. Studies by Sussman et al. (1970) and Horioka 
(1983) offer much more direct evidence on the nature of bequest 
motives. Both studies confirm the significance of exchange-motivated 
bequests. 

Sussman et al. conducted a painstaking study of close to 1,000 
estates selected from Cleveland probate court. They document the 
use of bequests as a means of payment by finding a significant effect 
of intrafamily exchange on deviations from equal divisions of be¬ 
quests. In case after case, “reciprocity was expressed through the 
distribution to particular children for services rendered to parents [so
that] children who took care of their elders . . . received the largest
share of the parent’s property or the only share if the estate was very
small” (p. 290). Disinheritance was usually a side effect of rewarding a 
specific child for care given in old age (p. 103), although some parents 
specifically disinherited children who ignored them. 

It is important to emphasize that both testators and beneficiaries
clearly perceived and consciously exploited opportunities for ex¬ 
change involving bequests. Testators frequently left most of their 
estates to spouses in part so that the spouses would “have a legacy to 
use in bargaining for services from children and others later on” 
(p. 290). Likewise, “children feel that they should maintain intimate 
contact with aged parents in order to provide them with emotional 
support and social and recreational opportunities, and that such con¬ 
tact maintenance is requisite for obtaining a share of the inheritance” 
(p. 119). When interviewed, children “generally accept the notion that
the sibling who has rendered the greatest amount of service to the 
aged parent should receive a major portion of the inheritance" 
(p. 118) and usually prefer that bequests be divided according to the 
principle of reciprocity (p. 148). 

Horioka (1983) reproduces the results of a survey of attitudes of 
the elderly in Japan toward the distribution of their assets among 
their children. Of the respondents 35.1 percent indicated that they 
would “give more to the child or children who did more for me." 
This, however, should be thought of as a lower bound on the 
significance of exchange-motivated bequests. The traditional pattern 
in Japanese families is for the eldest son to move in with and care for 
his elderly parents until their deaths, at which time he receives the 
entire estate. Thus the 43.2 percent of respondents who indicated
that they would “give all to the eldest son” may have simply announced
their equilibrium choices, having already received cooperation
from that child. It is worth noting that only 12.1 percent said that
they would “divide equally between one's children,” while only 4.3
percent were inclined to “give to the child who is ill or physically weak
or who has no income-earning power.” Thus, neither egalitarian nor
altruistic motives appear to be particularly prevalent.

27 Assuming that they enter the parent's utility function symmetrically.

IV. Macroeconomic Implications 

In the preceding section we developed the implications of our model 
for several aspects of individual behavior and contrasted these with 
predictions based on alternative models. This section focuses on the 
macroeconomic implications of the strategic bequest motive. 

Our paper provides an example of an environment in which parents
and children are linked by voluntary utility-maximizing intergenerational
transfers and in which parents care directly about the welfare of
their children. The Ricardian equivalence theorem and related prop¬ 
ositions are nevertheless false in general. The implications of our 
formulation for issues such as the effects of social security and gov¬ 
ernment indebtedness on capital formation correspond very closely to 
the implications of standard life-cycle models. 28 This model captures
an intuitively plausible aspect of the world that the altruism model 
does not. Parents would prefer to receive a gift to having their chil¬ 
dren receive an equal gift, even when they care about the utility of 
their children and make transfers to them. 

Several reasons for preferring the current model to the “dynastic 
altruism” formulation of Barro (1974) were discussed in the preceding
sections. We are unaware of any direct microeconomic evidence
favoring the notion of altruistic bequests. Until such evidence is pro¬ 
vided, economists should be cautious about justifying the analytical 
use of infinite-lived consumers by appealing to dynastic altruism. 

The model developed here suggests a number of potentially impor¬ 
tant interactions between demographic and economic phenomena. 
By conditioning bequests on behavior, parents may successfully in¬ 
fluence decisions by their children concerning education, migration, 
and marriage. The desire to purchase services from children, coupled 
with the need to have at least two credible beneficiaries, may also 
affect fertility. This could, for example, account for Park’s (1983) 
observation that Korean households have a strong preference for two 
male children and could strengthen theories of the so-called demographic
transition based on parental desire for care during old age.
The model also suggests that various exogenous demographic trends
will have specific economic effects. Declining population growth
means more single-child families and therefore less incentive to save
to purchase attention. For similar reasons, rising life expectancies,
longer retirement periods, and increasing geographic mobility may
all affect the national savings rate. These and related issues are discussed
in Bernheim (1984c).

28 Barro (1974, p. 1106, n. 14) himself notes that the Ricardian equivalence theorem
would not hold if exchange played a large role in motivating bequests.

The model also suggests that international variations in savings
rates may be related to differences in family structure, as well as to
legal institutions governing the distribution of estates. For example,
Horioka's evidence indicates that exchange motivates the division of
bequests in many Japanese households. In addition to the survey of
attitudes discussed in Section III, he documents that over 80 percent
of elderly Japanese live with their children, compared with approximately
10 percent for the United States. This may help to account for
Japan’s high rate of saving. In contrast, certain European countries,
such as Sweden (see Blomquist 1979), require testators to divide the
bulk of their estates evenly between their children. This restriction
neutralizes the mechanism outlined in Section I and removes a strong
incentive for accumulating bequeathable wealth.

Our analysis also suggests a subtle but possibly important side effect
of the growth of social security and the spread of annuitized private
pensions. The model here provides a partial explanation for consumers'
reluctance to purchase annuities even at relatively attractive rates;
annuities deny consumers the opportunity to purchase care and attention
from their children (although much of the actual aversion to
annuities is undoubtedly based on ignorance and confusion). If social
security or pensions foist more annuity protection on consumers than
they wish, a collateral consequence will be that consumers are able to
purchase less attention than they would prefer. A general decline in
attentiveness of children to parents is widely alleged to have taken
place since the introduction of social security (see, e.g., Friedman
1980). The significance of the effect stressed here is of course difficult
to gauge. 29

This research could usefully be extended in a number of directions.
It would be valuable to explore models in which more elaborate interactions
between children were possible. Empirically, the insights suggested
by this model could be used to inform econometric analyses of
the consumption and portfolio choices of the aged. In addition, it
might be useful to use simulation techniques to examine the relation
between bequests of the type modeled here and the level of capital
formation. It is unlikely that any of these extensions would cast doubt
on our conclusion that the strategic motive is central to the economic
analysis of bequests.

29 The model also implies that social security offsets private savings by less than one
for one.


References 

Adams, James D. “Equalization of True Gift and Estate Tax Rates.” J. Public
Econ. 9 (February 1978): 59-71.

Adams, James D. “Personal Wealth Transfers.” Q.J.E. 95 (August 1980): 159-79.

Barro, Robert J. “Are Government Bonds Net Wealth?” J.P.E. 82 (November/December
1974): 1095-1117.

Becker, Gary S. “A Theory of Social Interactions.” J.P.E. 82 (November/December
1974): 1063-93.

Becker, Gary S. A Treatise on the Family. Cambridge, Mass.: Harvard Univ. Press, 1981.

Ben-Porath, Yoram. “The F-Connection: Families, Friends and Firms, and
the Organization of Exchange.” Report no. 29/78. Jerusalem: Hebrew
Univ., Inst. Advanced Studies, 1978.

Bernheim, B. Douglas. “Annuities, Bequests, and Private Wealth.” Mimeographed.
Stanford, Calif.: Stanford Univ., 1984. (a)

Bernheim, B. Douglas. “Dissaving after Retirement: Testing the Pure Life Cycle
Hypothesis.” Mimeographed. Stanford, Calif.: Stanford Univ., 1984. (b)

Bernheim, B. Douglas. “Intrafamily Conflict and Its Resolution: Implications for
Demographic Choice.” Mimeographed. Stanford, Calif.: Stanford Univ., 1984. (c)

Bernheim, B. Douglas; Shleifer, Andrei; and Summers, Lawrence H. “Bequests
as a Means of Payment.” Working Paper no. 1303. Cambridge,
Mass.: N.B.E.R., 1984.

Blinder, Alan S. Toward an Economic Theory of Income Distribution. Cambridge,
Mass.: MIT Press, 1974.

Blomquist, N. Sören. “The Inheritance Function.” J. Public Econ. 12 (August
1979): 41-60.

Brittain, John A. Inheritance and the Inequality of Material Wealth. Washington:
Brookings Inst., 1978.

Cooper, George. A Voluntary Tax? New Perspectives on Sophisticated Estate Tax
Avoidance. Washington: Brookings Inst., 1979.

Davies, James B. “Uncertain Lifetime, Consumption, and Dissaving in Retirement.”
J.P.E. 89 (June 1981): 561-77.

Diamond, Peter A., and Hausman, Jerry A. “Individual Retirement and Savings
Behavior.” Mimeographed. Cambridge: Massachusetts Inst. Tech.,
1982.

Fox, Alan. “Alternative Measures of Earnings Replacement for Social Security
Benefits.” In Reaching Retirement Age: Findings from a Survey of Newly
Entitled Workers, 1968-70. Research Report no. 47. Office of Research and
Statistics, Social Security Administration. Washington: Government Printing
Office, 1976.

Friedman, Milton. “Discussion.” In The American Economy in Transition, edited
by Martin Feldstein. Chicago: Univ. Chicago Press (for N.B.E.R.), 1980.

Hausman, Jerry A. “Specification Tests in Econometrics.” Econometrica 46
(November 1978): 1251-71.




Horioka, Charles Y. “The Applicability of the Life Cycle Model of Savings to
Japan.” Mimeographed. Kyoto: Kyoto Univ., 1983.

King, Mervyn A., and Dicks-Mireaux, L.-D. L. “Asset Holdings and the Life-Cycle.”
Econ. J. 92 (June 1982): 247-67.

Kotlikoff, Laurence J., and Spivak, Avia. “The Family as an Incomplete Annuities
Market.” J.P.E. 89 (April 1981): 372-91.

Kotlikoff, Laurence J., and Summers, Lawrence H. “The Role of Intergenerational
Transfers in Aggregate Capital Accumulation.” J.P.E. 89 (August
1981): 706-32.

Mirer, Thad W. “The Wealth-Age Relation among the Aged.” A.E.R. 69
(June 1979): 433-43.

Menchik, Paul L. “Primogeniture, Equal Sharing, and the U.S. Distribution of
Wealth.” Q.J.E. 94 (March 1980): 299-316.

Park, Chai Bin. “Preferences for Sons, Family Size, and Sex Ratio: An Empirical
Study in Korea.” Demography 20 (August 1983): 333-52.

Selten, Reinhard. “Reexamination of the Perfectness Concept for Equilibrium
Points in Extensive Games.” Internat. J. Game Theory 4, no. 1 (1975):
25-55.

Sheshinski, Eytan, and Weiss, Yoram. “Uncertainty and Optimal Social Security
Systems.” Q.J.E. 96 (May 1981): 189-206.

Shorrocks, A. F. “The Age-Wealth Relationship: A Cross-Section and Cohort
Analysis.” Rev. Econ. and Statis. 57 (May 1975): 155-63.

Shoup, Carl S. Federal Estate and Gift Taxes. Washington: Brookings Inst.,
1966.

Sussman, M. B.; Cates, J. N.; and Smith, D. T. Inheritance and the Family. New
York: Sage, 1970.

TIAA-CREF. A Survey of Beneficiaries. Washington: TIAA-CREF, 1973.

Tobin, James. “Life Cycle Saving and Balanced Growth.” In Ten Economic
Studies in the Tradition of Irving Fisher, by William Fellner et al. New York:
Wiley, 1967.

Tomes, Nigel. “The Family, Inheritance, and the Intergenerational Transmission
of Inequality.” J.P.E. 89 (October 1981): 928-58.

Urban Systems Research and Engineering. A Summary of Recent Research on
Inflation and the Elderly. Mimeographed. Cambridge, Mass.: Urban Systems
Res. and Engineering, 1983.

Warshawsky, Mark. “Uncertain Lifetime, Demand for Private Annuities, and
Provision for Bequests in Retirement.” Mimeographed. Cambridge, Mass.:
Harvard Univ., December 1983.



Heterogeneity, Aggregation, and Market 
Wage Functions: An Empirical Model of 
Self-Selection in the Labor Market 


James J. Heckman 

University of Chicago 


Guilherme Sedlacek 

Carnegie-Mellon University


This paper presents an empirical equilibrium model of self-selection
in the labor market that recognizes the existence of measured and
unmeasured heterogeneous skills. We derive a model of the sectoral
allocation of workers of different demographic types and present a
new econometric procedure for combining micro and macro data to
estimate supply and demand functions for unmeasured sector-specific
productive attributes. Our model extends previous empirical
work on wage equations by introducing determinants of aggregate
market demand and supply into an explicit, economically interpretable
estimating equation. These extensions are required to produce
a model that fits the distribution of wages for the U.S. labor market.


This research was supported by NSF grants DAR 79-25924, SES-8107963, and SES
84-11242 to the Quantitative Methods Center at National Opinion Research Center.
The first draft (entitled “The Impact of the Minimum Wage on the Employment and
Earnings of Workers in South Carolina”) was prepared in December 1980. That draft
and subsequent drafts have been circulated since 1981 in Heckman's course, Labor
Economics 442. A fourth draft (entitled “An Equilibrium Model of the Industrial
Distribution of Workers and Wages”) was presented to the summer meetings of the
Econometric Society in Stanford, California, June 1984. Bo Honoré, Ricardo Barros,
Joe Hotz, Robert Michael, Sherwin Rosen, and José Scheinkman made valuable comments
on this paper as did participants in seminars at Chicago, Columbia, Concordia
(Montreal), Kentucky, and Penn. We are especially grateful to Ricardo Barros for
valuable comments. We also would like to thank two anonymous referees. We thank
Vicky Longawa for valuable editorial assistance.

[Journal of Political Economy, 1985, vol. 93, no. 6]

© 1985 by The University of Chicago. All rights reserved. 0022-3808/85/9306-0012$01.50


Diversity in the amount and type of skills possessed by workers is a 
central feature of modern labor markets. Yet econometric analysis of 
aggregate labor market data either ignores such diversity entirely 
(e.g., Sargent 1978; Geary and Kennan 1982) or assumes homogeneous
skills for workers classified by such criteria as age, race, education,
and sex (e.g., Hamermesh and Grant 1979; Gollop and Jorgenson
1983; Jorgenson 1985). While the second approach to labor
aggregation improves on the first by recognizing worker diversity, it 
still ignores plausible heterogeneity in skills within the available crude
demographic categories. Moreover, it is not obvious that demo¬ 
graphic categories define economically meaningful skill categories. 

Welch (1969) recognizes the diversity of skills within crude demographic-education
groups and uses the Lancaster (1966) and Gorman
(1980) characteristics model to postulate that labor incomes are the
sum of the incomes earned on distinct measured and unmeasured
attributes owned by each worker with a uniform price per attribute
across all market sectors. In his model, workers are indifferent among
sectors of the economy (i.e., there is no scope for comparative advantage)
because identical firms are able to repackage worker skill bundles
costlessly.

Heckman and Scheinkman (1982), building on suggestions by Mandelbrot
(1962), derive conditions under which prices for measured
and unmeasured attributes are uniform across all market sectors.
They present empirical evidence that rejects this description of the
labor market and hence the Welch approach for U.S. data. Their
evidence suggests that the pursuit of comparative advantage is an
important feature of U.S. labor market data (see also Sattinger 1980).

This paper presents an empirical equilibrium model of comparative
advantage or self-selection in the labor market that recognizes the
existence of measured and unmeasured heterogeneous skills within
even narrowly defined demographic groups. The points of departure
for our work are the seminal Roy (1951) model of income distribution
and later applications of the Roy model by Rosen (1978) and Willis
and Rosen (1979). We derive a model of the sectoral allocation of
workers of different demographic types. We also present a new
econometric procedure for combining micro and macro data to estimate
supply and demand functions for unmeasured sector-specific
productive attributes.

Our methodology extends previous statistical work on self-selection
to an explicit market setting in which the prices of attributes respond
to changes in the determinants of aggregate demand and supply. Our
model extends previous empirical work on wage equations by introducing
determinants of aggregate market demand and supply into
explicit, economically interpretable estimating equations. We extend
Roy’s model of self-selection by embedding it in a market setting and
by (a) introducing a nonmarket sector, (b) allowing workers to select
their sector of employment on the basis of utility maximization rather
than income maximization, and (c) permitting unmeasured attributes
to be nonlognormally distributed. These extensions are required to
produce a model that fits the distribution of wages for the U.S. labor
market.

This study presents evidence that supports the commonly utilized
practice of aggregating manufacturing into a single sector for the
purpose of estimating labor demand functions. However, a new
aggregate is required that recognizes both measured and unmeasured
heterogeneity in skills in the population and accounts for
self-selection decisions by agents.

We use our model to estimate the importance of aggregation bias in
measured aggregate real wage rates. Aggregation bias reduces measured
wage variability in manufacturing below what it would be if the
quality of the manufacturing work force were held constant. However,
for the economy as a whole, precisely the opposite effect occurs.
Aggregation bias causes measured aggregate wage variability to
overstate quality-constant wage variability. Because of comparative
advantage, workers who move from one sector to another in response to a
macro disturbance lower the average quality of the work force in the
sector to which they go and raise the average quality in the sector
from which they depart. This phenomenon accentuates measured
wage variability over what it would be if sectoral labor force quality
were held constant.

This paper is in four sections. Section I presents a rigorous
statement of Roy’s model of self-selection and embeds it in a market
setting. We present a new method for combining micro and macro data
to estimate the demand and supply of unmeasured sector-specific
productive attributes. Section II extends Roy’s model. Our extended
model nests Roy’s as a special case and so is convenient for
econometric testing. Unlike the Roy model, the proposed model can generate
Pareto-like right tails that are claimed to be an essential feature of
income and wage distributions by Mandelbrot (1962), Lydall (1968),
and others. Section III reports empirical estimates and tests of the
new model. We estimate the contribution of self-selection to income
inequality and present empirical evidence on the importance of
aggregation bias in measured aggregate real wage movements. The
paper concludes with a brief summary (Sec. IV).

I. An Estimable General Equilibrium Roy Model 

A. The Model 

We begin the analysis by expositing the point of departure for our
own work: the Roy model of self-selection for workers with
heterogeneous skills. Following Roy, we assume that there are two market
sectors in which income-maximizing agents can work. Agents are free
to enter the sector that gives them the highest income. However, they
can work in only one sector at a time.

Each sector requires a unique sector-specific task. Each agent is 
endowed with a /-dimensional skill vector s that enables him to per¬ 
form sector-specific tasks. Vector s is continuously distributed with 
density $g(s \mid \Theta)$, where $\Theta$ is a vector of parameters. The model is short
run in that aggregate skill distributions are assumed to be given.
There are no costs of changing sectors, and investment is ignored.
Because of this assumption, the model presented here applies to
environments with certain or uncertain prices for sector-specific tasks. For
simplicity and without any loss of generality, we assume an
environment of perfect certainty. We leave the development of a more
dynamic model with investment and mobility costs for another occasion.

Let $t_i(s)$ be a nonnegative function that expresses the amount of
sector $i$ specific task a worker with skill endowment $s$ can perform.
This function is technologically determined. However, it may shift
over time as technology changes. The task functions are assumed to
be continuously differentiable in $s$. The distinction between tasks as
objects of firm demand and skills as endowments of workers captures
the idea that packages of skills cannot be unbundled and that different
skills are used in different tasks.1

The output of sector $i$, denoted $Y_i$, is assumed to depend on the sum
of individual sector-specific tasks employed in the sector and not on
its distribution. Denoting $A_i$ as a vector of nonlabor inputs, the
aggregate production function for sector $i$ is assumed to be of the form

$$Y_i = F^{(i)}(T_i, A_i), \quad i = 1, 2,$$

where $T_i$ is the total amount of task $i$ employed in sector $i$. Function $F^{(i)}$
is assumed to be twice continuously differentiable and strictly concave
in all of its arguments, with positive inputs required for positive
outputs.

For fixed output price $P_i$, the equilibrium price of task $i$ in sector $i$,
denoted $\pi_i$, is the value of the marginal product of a unit of the task

$$\pi_i = P_i \frac{\partial F^{(i)}}{\partial T_i}, \quad i = 1, 2. \tag{1}$$


An agent with endowment $s$ works in sector $i$ if his income is higher
there, that is,

$$\pi_i t_i(s) \geq \pi_j t_j(s), \quad i \neq j, \quad i, j = 1, 2. \tag{2}$$

1 This specification is sufficiently general that it permits the same skills to be equally
productive in generating all tasks. Thus some of the skills may have the economic
character of general human capital.

Indifference between sectors is a negligible probability event since the
$t_i$, $i = 1, 2$, are assumed to be continuous nondegenerate random
variables.2 Throughout we assume that prices are positive ($\pi_i > 0$).

Inequality (2) defines a set of $s$ values, not necessarily connected, in
which agents with values of $s$ in the set earn their highest income by
working in sector $i$. This set is defined as

$$\mathcal{J}_i = \{s : \pi_i t_i(s) > \pi_j t_j(s),\ i \neq j\}.$$

If the $t_i$ are linear functions of $s$, (2) partitions the domain of $s$ into
two connected sets. For this specification of the $t_i(s)$ functions (and
others as well) there is a market stratification of workers into tasks by
their $s$ type. Demographic groups differing in their distribution of
skill endowment tend to specialize in different sectors. There may be
“black” or “teenage” jobs, not because those demographic categories
are of direct interest to employers, but because members of those
groups possess skill endowments of special use in a particular sector.

The log wage in sector $i$ of an individual with endowment $s$ is

$$\ln w_i(s) = \ln \pi_i + \ln t_i(s). \tag{3}$$

Assuming that the function mapping skills to tasks does not change
over time, (3) implies that log wage functions (expressed as functions
of $s$) have identical coefficients in successive cross sections except for
their intercepts. This implication is termed the “proportionality
hypothesis” in this paper.3 Specification (3) rationalizes the “paradoxical”
result that the rate of return to schooling (the coefficient of
schooling in a log wage equation that is linear in schooling) has not
changed over time despite expansion in the aggregate stock of schooling.
In wage function (3) an exogenous increase in the supply of
schooling affects only the intercept of the log wage equation.

Wage equation (3) is not a conventional hedonic function. In the
hedonic models of Tinbergen (1951, 1956) and Rosen (1974), an
implicit market prices out each component of $s$. In the model of (3)
the $t_i$ are priced out, not (directly) the components of $s$. Hedonic wage
equations fit in separate market sectors could be interpreted as
revealing “prices” for the attributes in each sector, but there would be
no economic content in such an interpretation. In a single cross
section of data, wage equation (3) is empirically indistinguishable from a
conventional hedonic wage equation.


2 More precisely, $(t_1, t_2)$ is a nondegenerate, continuously distributed random vector.

3 A better name would be “additivity hypothesis,” but that term has special meaning
in the theory of consumer demand. Clearly wage functions in levels are proportionately
related across time if the hypothesis is correct.
The proportion of the population working in sector $i$ is the
proportion of the population whose skill endowments lie in $\mathcal{J}_i$,

$$\mathrm{pr}(i) = \int_{\mathcal{J}_i} g(s \mid \Theta)\, ds.$$

The aggregate supply of task to sector $i$, $T_i$, is obtained by integrating
the micro supply over $\mathcal{J}_i$:

$$T_i = \int_{\mathcal{J}_i} t_i(s)\, g(s \mid \Theta)\, ds, \quad i = 1, 2.$$

Both $T_i$ and $\mathrm{pr}(i)$ are homogeneous of degree zero functions of the $\pi$s and
are monotone increasing (strictly nondecreasing) functions of $\pi_i$ and
monotone decreasing (strictly nonincreasing) functions of $\pi_j$ ($i \neq j$).
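
The homogeneity and monotonicity properties just stated can be checked on a simulated population. The following is only a sketch under assumptions that are not in the text: endowments $(t_1, t_2)$ are drawn as independent lognormals, and the function name `task_supply` and all numerical values are hypothetical.

```python
import random

def task_supply(pi1, pi2, endowments):
    """Return (share in sector 1, per capita T1, per capita T2).

    Each agent works in sector 1 when pi1 * t1 > pi2 * t2 and in
    sector 2 otherwise, which is the income-maximizing rule.
    """
    n = len(endowments)
    in_sector_1 = [pi1 * t1 > pi2 * t2 for t1, t2 in endowments]
    share_1 = sum(in_sector_1) / n
    supply_1 = sum(t1 for (t1, _), chosen in zip(endowments, in_sector_1) if chosen) / n
    supply_2 = sum(t2 for (_, t2), chosen in zip(endowments, in_sector_1) if not chosen) / n
    return share_1, supply_1, supply_2

# Hypothetical population: independent lognormal task endowments.
rng = random.Random(0)
endowments = [(rng.lognormvariate(0.0, 1.0), rng.lognormvariate(0.2, 0.5))
              for _ in range(5000)]

base = task_supply(1.0, 1.0, endowments)
scaled = task_supply(2.0, 2.0, endowments)      # both prices doubled
higher_pi1 = task_supply(1.5, 1.0, endowments)  # only pi1 raised
```

Doubling both prices leaves the allocation unchanged, while raising $\pi_1$ alone can only move agents into sector 1.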

An equilibrium exists for a given vector $A_i$ of nonlabor inputs if,
when the $T_i$ are inserted in (1), there are prices $\pi_i$ that exactly evoke
supply $T_i$, $i = 1, 2$. Under standard conditions on the technology it is
possible to establish that an equilibrium exists in the labor market.

Without further restrictions, the Roy model produces no interesting
refutable empirical hypotheses.4 To produce such hypotheses it
is necessary to postulate specific functional forms.

Roy assumes that the density of skills $g(s \mid \Theta)$ and the task functions
$t_i(s)$ are such that $(\ln t_1, \ln t_2)$ is normally distributed with mean $(\mu_1, \mu_2)$
and covariance matrix $\Sigma$. Letting $(u_1, u_2)$ be a mean zero normal
vector, agents in the Roy model choose between two possible wages:

$$\ln w_1 = \ln \pi_1 + \mu_1 + u_1$$

or

$$\ln w_2 = \ln \pi_2 + \mu_2 + u_2.$$

Workers enter sector 1 if $\ln w_1 > \ln w_2$. Otherwise they enter sector 2.
Letting $\sigma^* = \sqrt{\mathrm{var}(u_1 - u_2)}$ and $c_i = [\ln(\pi_i/\pi_j) + \mu_i - \mu_j]/\sigma^*$, $i \neq j$,

$$\mathrm{pr}(i) = P(\ln w_i > \ln w_j) = \Phi(c_i), \quad i \neq j, \quad i, j = 1, 2,$$

where $\Phi(\cdot)$ is the cumulative distribution function of a standard
normal variable. When standard sample selection bias formulae are
used (see, e.g., Heckman 1976, 1979), the mean of log wages observed
in sector $i$ is

$$E(\ln w_i \mid \ln w_i > \ln w_j) = \ln \pi_i + \mu_i + \frac{\sigma_{ii} - \sigma_{ij}}{\sigma^*} \lambda(c_i), \quad i, j = 1, 2, \quad i \neq j, \tag{4}$$


4 The proportionality hypothesis is an implication of the assumption of the existence
of sector-specific efficiency units that underlies wage specification (3) and not
specifically the Roy model. For further discussion of the empirical content of the Roy
model, see Heckman and Singer (1985).

where

$$\lambda(c) = \frac{\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} c^2\right)}{\Phi(c)}$$

is a convex monotone decreasing function of $c$ with $\lambda(c) > 0$,

$$\lim_{c \to \infty} \lambda(c) = 0, \qquad \lim_{c \to -\infty} \lambda(c) = \infty.$$

Convexity is proved in Heckman and Sedlacek (1986).
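
The sectoral share $\Phi(c_i)$ and the selected mean in (4) can be compared with a Monte Carlo draw from the lognormal Roy model. This is a minimal sketch rather than the authors' procedure: the function names and the parameter values used to exercise them are hypothetical, and sectors are indexed 0 and 1 in the code.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c), the selection correction term."""
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)

def roy_sector_moments(ln_pi, mu, sigma, i):
    """Analytic share and mean log wage of sector i (0 or 1)."""
    j = 1 - i
    sigma_star = math.sqrt(sigma[0][0] + sigma[1][1] - 2.0 * sigma[0][1])
    c = (ln_pi[i] - ln_pi[j] + mu[i] - mu[j]) / sigma_star
    share = STD_NORMAL.cdf(c)
    mean = ln_pi[i] + mu[i] + (sigma[i][i] - sigma[i][j]) / sigma_star * inverse_mills(c)
    return share, mean

def simulate_sector(ln_pi, mu, sigma, i, n=200_000, seed=0):
    """Simulated share and mean log wage of sector i under income maximization."""
    rng = random.Random(seed)
    a = math.sqrt(sigma[0][0])            # Cholesky factor of sigma
    b = sigma[0][1] / a
    d = math.sqrt(sigma[1][1] - b * b)
    chosen = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = (a * z1, b * z1 + d * z2)
        ln_w = (ln_pi[0] + mu[0] + u[0], ln_pi[1] + mu[1] + u[1])
        if ln_w[i] > ln_w[1 - i]:
            chosen.append(ln_w[i])
    return len(chosen) / n, sum(chosen) / len(chosen)

# Hypothetical parameter values, used only to exercise the formulas.
LN_PI = (0.0, 0.1)
MU = (1.0, 1.2)
SIGMA = ((1.0, 0.3), (0.3, 2.0))
```

With these values the analytic and simulated moments should agree up to sampling error.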

The variance of log wages observed in sector $i$ is

$$\mathrm{var}(\ln w_i \mid \ln w_i > \ln w_j) = \sigma_{ii}\{\rho_i^2[1 - c_i\lambda(c_i) - \lambda^2(c_i)] + (1 - \rho_i^2)\}, \tag{5}$$

where $\rho_i = \mathrm{correl}(u_i, u_i - u_j)$, $i \neq j = 1, 2$. The variance of the log
of observed wages never exceeds $\sigma_{ii}$, the population variance, because
the term in braces in (5) is never greater than unity. In general,
sectoral variances decrease with increased selection. For example, if $\rho_1$
and $\rho_2$ do not equal zero, as $\pi_1$ increases with $\pi_2$ held fixed so that
people shift from sector 2 to sector 1, the variance in the log of wages
in sector 1 increases while the variance in the log of wages in sector 2
decreases.
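
A short numerical sketch of variance formula (5) follows. The function name `selected_variance` and the values of $\rho_i$ and $c_i$ are hypothetical choices made for illustration, not quantities reported in the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)

def selected_variance(sigma_ii, rho, c):
    """Variance of log wages observed in a sector, as in equation (5)."""
    lam = inverse_mills(c)
    truncation_factor = 1.0 - c * lam - lam * lam
    return sigma_ii * (rho * rho * truncation_factor + (1.0 - rho * rho))

# As c rises, a larger share of the population works in the sector,
# selection weakens, and the observed variance approaches sigma_ii.
variances = [selected_variance(1.0, 0.8, c) for c in (-1.0, 0.0, 1.0)]
```

The list `variances` is increasing in $c$ and bounded above by the population variance, which matches the statement that sectoral variances fall as selection becomes more severe.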

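Using the fact that $w_i = \pi_i t_i$,

$$E(\ln t_1 \mid \ln w_1 > \ln w_2) = \mu_1 + \frac{\sigma_{11} - \sigma_{12}}{\sigma^*} \lambda(c_1), \tag{4a'}$$

$$E(\ln t_2 \mid \ln w_2 > \ln w_1) = \mu_2 + \frac{\sigma_{22} - \sigma_{12}}{\sigma^*} \lambda(c_2). \tag{4b'}$$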

Focusing on (4a)' and noting that $\lambda$ is positive for all values of $c_1$
(except $c_1 = \infty$), we see that the mean of log task 1 used in sector 1
exceeds, equals, or falls short of the population mean endowment of
log task 1 as $\sigma_{11} - \sigma_{12}$ is greater than, equal to, or less than zero. If
endowments of tasks are uncorrelated ($\sigma_{12} = 0$), self-selection always
causes the mean of $\ln t_1$ employed in sector 1 to be above the
population mean $\mu_1$. The opposite case occurs when $\sigma_{11} - \sigma_{12}$ is negative.
This case can arise only when values of $\ln t_1$ and $\ln t_2$ are sufficiently
positively correlated. If this occurs, the mean of log task 1 used in
sector 1 falls below the population mean $\mu_1$. Since covariance matrices
must be positive semidefinite, $\sigma_{11} + \sigma_{22} - 2\sigma_{12} \geq 0$. Thus if $\sigma_{11} - \sigma_{12}
< 0$, $\sigma_{22} - \sigma_{12} > 0$ so the mean of log task 2 employed in sector 2
necessarily lies above the population mean $\mu_2$. In the Roy model the
unusual case can arise in at most one sector. Notice from (5) that only
if $\sigma_{11} - \sigma_{12} = 0$ (so $\rho_1 = 0$) is the variance of log task 1 employed in
sector 1 identical to the variance of log task 1 in the population.
Otherwise, the sectoral variance of observed log task 1 is less than the
population variance of log task 1.

To gain further insight into the effect of self-selection on the
distribution of earnings for workers in sector 1, it is helpful to draw on
some results from normal regression theory. The regression equation
for $\ln t_2$ conditional on $\ln t_1$ is

$$\ln t_2 = \mu_2 + \frac{\sigma_{12}}{\sigma_{11}}(\ln t_1 - \mu_1) + \varepsilon_2, \tag{6}$$

where $E(\varepsilon_2) = 0$ and $\mathrm{var}(\varepsilon_2) = \sigma_{22}[1 - (\sigma_{12}^2/\sigma_{11}\sigma_{22})]$.

Figure 1 plots regression function (6) for the case $\sigma_{12} = \sigma_{11}$ and
$\mu_2 > \mu_1 > 0$. For each value of $\ln t_1$, the population values of $\ln t_2$ are
normally distributed around the regression line. Individuals with
high values of $\ln t_1$ also tend to have a high value of $\ln t_2$. Assuming
$\pi_1 = \pi_2$, individuals with $(\ln t_1, \ln t_2)$ endowments above the 45-degree
line of equal income shown in figure 1 choose to work in sector 2,
while those individuals with endowments below this line work in
sector 1. Because $\sigma_{12} = \sigma_{11}$, the regression function is parallel to the line
of equal income.

The distribution of $\varepsilon_2$ about the regression line is the same for all
values of $\ln t_1$. When individuals are classified on the basis of their
$\ln t_1$ values, the same proportion of individuals work in sector 1 at all
values of $\ln t_1$. For this reason the distribution of $\ln t_1$ employed in
sector 1 is the same as the latent population distribution. If $\pi_1$ is raised
(or $\pi_2$ is lowered) so that the 45-degree equal income line is shifted
upward, the same proportion of people enter sector 1 at each value of
$t_1$.

Figure 2 plots regression function (6) for the case $\sigma_{12} > \sigma_{11}$ and
$\mu_2 > \mu_1 > 0$. As before we set $\pi_1 = \pi_2$. Individuals with endowments
above the 45-degree line choose to work in sector 2, while those with
endowments below this line work in sector 1. When individuals are
classified on the basis of their $t_1$ values, the fraction of people working
in sector 1 decreases the higher the value of $t_1$. Self-selection causes the
mean of log task 1 employed in sector 1 to be less than the mean of log
task 1 in the total population. People with high values of $t_1$ are
underrepresented in sector 1 and low $t_1$ values are overrepresented. In the
extreme, when $\ln t_1$ and $\ln t_2$ are perfectly correlated, all high-income
individuals are in sector 2, while all the low-income individuals are in
sector 1. The highest-paid sector 1 worker earns the same as the
lowest-paid sector 2 worker.

If $\pi_1$ is raised (or $\pi_2$ is lowered) so that the line of equal income is
shifted upward, the mean of $\ln t_1$ employed in sector 1 must rise. The
only place left to get $t_1$ is from the high end of the $t_1$ distribution.
Unlike the case of $\sigma_{12} = \sigma_{11}$, in which a 10 percent increase in $\pi_1$
results in a 10 percent increase in measured average earnings in
sector 1, when $\sigma_{12} > \sigma_{11}$ a 10 percent increase in $\pi_1$ results in a greater
than 10 percent increase in the measured average earnings in sector 1
as the average quality of the sector 1 work force increases. The
variance of log wages in sector 1 increases.

If $\sigma_{11} < \sigma_{12}$, then $\sigma_{12} < \sigma_{22}$ in order for $\Sigma$ to be a covariance matrix.
In the population, log task 2 must have greater variability than log
task 1. Individuals with high $t_1$ values tend to have high $t_2$ values. But
the population distribution of log task 2 has more mass in the tails.
The higher an agent's value of $t_1$, the more likely it is that he will be
able to get higher income in sector 2. At the lower end of the
distribution, the process works in reverse: lower $t_1$ individuals on average
have poor $t_2$ values. Self-selection causes the $\ln t_1$ distribution in sector
1 to have an evacuated right tail, an exaggerated left tail, and a lower
mean than the population mean of $\ln t_1$.

If $\sigma_{12} < \sigma_{11}$ (a case not depicted graphically), the proportion of each
$t_1$ group working in sector 1 increases, the higher the value of $t_1$. The
mean of the log task employed in sector 1 exceeds $\mu_1$. A 10 percent
increase in $\pi_1$ produces an increase of less than 10 percent in the
average earnings of workers in sector 1 as the mean of $\ln t_1$ employed
in sector 1 declines. In fact if $\sigma_{12} > \sigma_{22}$ it is possible for an increase in
$\pi_1$ to cause measured sector 1 wages to decline.
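
The three cases can be summarized by differentiating the selected mean in (4) with respect to $\ln \pi_1$, using $\lambda'(c) = -\lambda(c)[c + \lambda(c)]$ and $dc_1/d\ln \pi_1 = 1/\sigma^*$. The sketch below works with the mean of log wages rather than average earnings in levels, so it is only a log approximation to the percentage statements in the text; the function name and the covariance values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)

def mean_log_wage_response(s11, s22, s12, c1):
    """d E(ln w_1 | sector 1) / d ln pi_1 implied by equation (4).

    s11, s22, s12 are the elements of the covariance matrix of
    (ln t_1, ln t_2); c1 is the selection index of sector 1.
    """
    lam = inverse_mills(c1)
    lam_slope = -lam * (c1 + lam)      # derivative of lambda, in (-1, 0)
    var_diff = s11 + s22 - 2.0 * s12   # sigma* squared
    return 1.0 + (s11 - s12) / var_diff * lam_slope

equal_case = mean_log_wage_response(1.0, 2.0, 1.0, 0.3)     # s12 = s11
figure_2_case = mean_log_wage_response(1.0, 4.0, 1.5, 0.0)  # s12 > s11
uncorrelated = mean_log_wage_response(1.0, 1.0, 0.0, 0.0)   # s12 < s11
perverse_case = mean_log_wage_response(4.0, 1.0, 1.5, -3.0) # s12 > s22
```

The response equals one when $\sigma_{12} = \sigma_{11}$, exceeds one when $\sigma_{12} > \sigma_{11}$, lies below one when $\sigma_{12} < \sigma_{11}$, and can turn negative when $\sigma_{12} > \sigma_{22}$ and few agents work in sector 1.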


B. Estimating the Model

We next propose a method for consistently estimating (a) the
parameters of the distribution of tasks including the parameters of the
functions relating skills to tasks and (b) the parameters of the sectoral
demand functions for unmeasured tasks.

We assume access to the following commonly available data: (i)
time-series data on the aggregate amount of compensation paid to
workers in each sector; (ii) microeconomic repeated cross-section data
on the wages of workers by sector and their associated demographic
and productivity characteristics; and (iii) time-series data on sectoral
determinants of the demand for tasks. The most challenging aspect of
our problem is that quantities of sector-specific tasks and their
associated prices are not directly measured.

Assume that the functional form relating skills to tasks is $\ln t_i = c_i s$,
$i = 1, 2$. Vector $s$ is decomposed into measured and unmeasured
components, $s_m$ and $s_u$. Their associated coefficients in $c_i$ are $c_{im}$ and $c_{iu}$.
Assuming that (a) $s_m$ is distributed independently of $s_u$ and (b) $E(c_{iu}s_u)
= 0$ defines an operational task function. Let $\beta_i = c_{im}$, $c_{iu}s_u = u_i$, and
$s_m = x$ so that the task function may be written as

$$\ln t_i = \beta_i x + u_i, \quad i = 1, 2 \tag{7}$$

and log real wages are

$$\ln w_i = \ln \pi_i + \ln t_i = \ln \pi_i + \beta_i x + u_i, \quad i = 1, 2. \tag{8}$$

Unless $\sigma_{11} - \sigma_{12} = 0$, least-squares estimators of the parameters of
equation (8) fit on sector 1 wage data are inconsistent because of
selection bias. Empirical evidence of self-selection supports the model. It is
necessary to control for selection bias in order to perform a proper
test of the proportionality hypothesis. This hypothesis states that the
slope coefficients of selectivity-corrected real wage equation (8)
should be the same in all cross sections, but the intercept may vary if
task prices change.

The intercept of real wage equation (8) combines two parameters:
(a) the log of the real price of task $i$, $\ln \pi_i$, and (b) the intercept of the
task function, denoted $\beta_{0i}$. Assuming a time-invariant distribution of
unobservable $u_i$, sample selection bias corrected regressions of log
wages on $x$ consistently estimate $\ln \pi_i$ up to a constant ($\beta_{0i}$) from the
intercept of the wage equation. Conventional methods are available to
estimate consistently the slope coefficients of $\beta_i$ and $\Sigma = \mathrm{var}(u_1, u_2)$.5

5 See Heckman (1976) and Heckman and Sedlacek (1981, 1986). It is possible to
estimate $\Sigma$ and $(\mu_1, \mu_2)$ with no regressors in the model.

Estimating sample selection bias corrected versions of (8) for each
sector for each cross section generates a time series on $\ln \pi_i + \beta_{0i}$. To
obtain the quantities of log task employed in each sector in each
period, subtract the estimated intercept from the log real wage bill in
sector $i$, $WB_i$ (the total labor compensation paid out in the sector
denominated in constant dollars). This produces an estimate of labor
aggregate $\ln T_i$ up to an unknown additive constant ($\beta_{0i}$). This labor
aggregate is not a Divisia labor index. That index is constructed
assuming homogeneous skills for measured demographic categories.
Our index of labor skills recognizes that skills may be diverse within
even narrowly defined demographic groups, that demographic
groups are not necessarily economically meaningful skill groups, and
that self-selection determines the supply of skills in the market.6
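
The construction of the labor aggregate described above amounts to a subtraction in logs. The following sketch uses made-up wage bills and intercepts; the function names are hypothetical and no estimation is performed here.

```python
import math

def log_task_index(real_wage_bill, wage_intercepts):
    """Return ln WB minus the estimated wage-equation intercept, by year.

    The intercept estimates ln pi + beta_0, so the result equals
    ln T - beta_0: the log task aggregate up to an additive constant.
    """
    if len(real_wage_bill) != len(wage_intercepts):
        raise ValueError("one intercept is required per year")
    return [math.log(wb) - a for wb, a in zip(real_wage_bill, wage_intercepts)]

def log_task_growth(index):
    """Year-to-year changes, which do not depend on the unknown constant."""
    return [later - earlier for earlier, later in zip(index, index[1:])]

# Hypothetical inputs for one sector over three years.
wage_bill = [100.0, 104.0, 110.0]
intercepts = [1.50, 1.52, 1.55]

index = log_task_index(wage_bill, intercepts)
growth = log_task_growth(index)
```

Because the unknown constant $\beta_{0i}$ shifts every year's index by the same amount, growth rates of the aggregate are unaffected by it.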

Let $l$ denote a year subscript. Assuming that the aggregate derived
demand for tasks is loglinear in aggregate tasks and real task prices,
we write

$$\ln T_{il} = \delta_{0i} + \delta_{1i} \ln\left(\frac{\pi_{il}}{P_{il}}\right) + \delta_{2i} \ln\left(\frac{P_{Ail}}{P_{il}}\right) + \varepsilon_{il}, \quad l = 1, \ldots, L, \tag{9}$$

where $\varepsilon_{il}$ is a realization of a mean zero stationary stochastic process
that shocks production technology (1), $P_{Ail}$ is a vector of real prices for
other inputs, and $P_{il}$ is the real price of output of sector $i$ at time $l$.
Economic theory predicts that $\delta_{1i}$ should be negative.

Setting $\pi_{il}$, $i = 1, 2$, equal to one in a benchmark year defines the
units of tasks $T_{il}$. Using the definition of the real wage bill $T_{il}\pi_{il}
= WB_{il}$, we may write equation (9) as

$$\ln\left(\frac{WB_{il}}{P_{il}}\right) = [\delta_{0i} - \beta_{0i}(\delta_{1i} + 1)] + (\delta_{1i} + 1)(\ln \hat{\pi}_{il} - \ln P_{il}) + \delta_{2i} \ln\left(\frac{P_{Ail}}{P_{il}}\right) + e_{il}, \quad i = 1, 2, \quad l = 1, \ldots, L, \tag{10}$$

where $\ln \hat{\pi}_{il}$ is the intercept estimated from the microeconomic log
wage equation fit in sector $i$ in year $l$ (eq. [8]) and $e_{il}$ differs from $\varepsilon_{il}$ by
the estimation error of $\ln \hat{\pi}_{il}$ for $\ln \pi_{il}$; $e_{il} = \varepsilon_{il} + (\delta_{1i} + 1)(\beta_{0i} + \ln \pi_{il}
- \ln \hat{\pi}_{il})$. Because it is plausible that aggregate shocks ($\varepsilon_{il}$) determine
deflated product price ($P_{il}$) as well as $\pi_{il}$, least squares does not in
general consistently estimate the parameters of (10).

Potential instrumental variables for $\ln \hat{\pi}_{il}$ and $P_{il}$ include the
determinants of the aggregate skill distribution such as government policy
variables affecting labor supply.7 The fact that the $\ln \hat{\pi}_{il}$ are estimated
from cross-section data does not create any econometric problem
provided that in each cross section the $u_i$ are distributed independently of
each other, the number of cross-section observations used to estimate
$\ln \hat{\pi}_{il}$ becomes large relative to the number of time-series observations,8
and the numbers of both types of observations are assumed to become
large. Using standard instrumental variables methods, it is possible to
estimate consistently the parameters of demand equation (10).
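
The reason least squares fails while instrumental variables succeed can be seen in a stylized simulation. This is a sketch under assumptions that go beyond the text: one endogenous regressor, one instrument, a just-identified estimator, and a large simulated sample in place of a short time series. All names and numbers are hypothetical.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def ols_slope(y, x):
    """Least-squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(y, x, z):
    """Just-identified instrumental variables slope with instrument z."""
    return covariance(z, y) / covariance(z, x)

def simulate_demand(n=20_000, slope=-0.5, seed=0):
    """Regressor x responds to the same shock that enters the error in y."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        instrument = rng.gauss(0.0, 1.0)   # supply shifter, excluded from demand
        shock = rng.gauss(0.0, 1.0)        # aggregate shock moving the price too
        price = instrument + shock
        quantity = slope * price + 0.8 * shock + 0.5 * rng.gauss(0.0, 1.0)
        y.append(quantity)
        x.append(price)
        z.append(instrument)
    return y, x, z

y, x, z = simulate_demand()
ols_estimate = ols_slope(y, x)
iv_estimate = iv_slope(y, x, z)
```

In this design the least-squares slope is pulled toward zero by the common shock, while the instrumental variables slope stays near the value used to generate the data.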

6 For a discussion of Divisia indices of labor aggregates see Gollop and Jorgenson
(1983).

7 Jorgenson, Lau, and Stoker (1982) use such instruments in estimating general
equilibrium models.

8 In our case each cross section has around 3,200 observations whereas the time series
has only 14 observations.

The model can be extended to let the population mean of the task
function shift over time. Defining $\beta_{0il}$ as the intercept of task $i$
function in year $l$, we write $\beta_{0il} = m_i(l) + \eta_{il}$, where $m_i(l)$ is a function of
observed characteristics (e.g., polynomials in time) and $\eta_{il}$ is a mean
zero stationary stochastic process assumed to be distributed
independent of $m_i(l)$. Noting that $\ln \hat{\pi}_{il}$ includes $\beta_{0il}$ and substituting for $\beta_{0i}$ in
(10), we reach

$$\ln\left(\frac{WB_{il}}{P_{il}}\right) = \delta_{0i} + (\delta_{1i} + 1)(\ln \hat{\pi}_{il} - \ln P_{il}) - (\delta_{1i} + 1) m_i(l) + \delta_{2i} \ln\left(\frac{P_{Ail}}{P_{il}}\right) + [e_{il} - (\delta_{1i} + 1)\eta_{il}], \quad i = 1, 2, \quad l = 1, \ldots, L, \tag{11}$$

where $e_{il}$ is redefined using $m_i(l)$ in place of $\beta_{0i}$ in the previous
expression. Provided that $m_i(l)$ is a low-dimensional function of $l$,
instrumental variable methods still consistently estimate the parameters of (11).
However, if the technology mapping skills to tasks is subject to distinct
year-specific shocks, so $m_i(l)$ is a polynomial of degree $L$, there are no
degrees of freedom in the time series and none of the parameters of
the demand functions can be identified.

C. Concluding Remarks on the Roy Model

If only because the manufacturing sector as a whole has been the
focus of so many empirical studies of the demand for labor, a natural
starting point for our empirical analysis divides the economy into
manufacturing and nonmanufacturing sectors. By dividing the data
in this fashion, we can test for the existence of our proposed labor
aggregate in either sector.

For the model to be empirically acceptable, it is required that (a)
demand functions be downward sloping ($\delta_{1i} < 0$, $i = 1, 2$) and that (b)
the proportionality hypothesis of the temporal stability of the wage
equation (except for intercepts) not be rejected. In addition, since a
normality assumption for $(u_1, u_2)$ is not innocuous and sample
selection bias corrections based on misspecified distributions produce
biased estimates,9 we require that fitted wage distributions accord with
actual wage distributions in the sense of producing an acceptable $\chi^2$
goodness-of-fit statistic.10


9 However, they can still be used to test consistently for sample selection (see
Heckman 1980). Goldberger (1983) and Heckman and MaCurdy (1985) discuss nonnormal
models.

10 A fourth test of the model examines evidence of sample selection bias (a nonzero
coefficient on $\lambda(c_i)$ for $i = 1, 2$) in the wage equation in at least one of the two sectors.
This test does not generalize to the model presented in the next section and so is not
discussed further.

When the Roy model is fit on Current Population Survey earnings
data disaggregated into manufacturing and nonmanufacturing
sectors, it is rejected by these test criteria. The proportionality hypothesis
is rejected and a goodness-of-fit test strongly rejects the underlying
distributional assumptions. (These estimates and their failure to
account for the observed income distribution are discussed in Heckman
and Sedlacek [1986].)

There are a number of possible responses to this rejection of the
model. One possible reason for rejection is the highly aggregative
nature of the manufacturing and nonmanufacturing sectors. By
disaggregating the data into smaller, more economically well-defined
sectors, we may be able to produce a model that survives our test
criteria. The practical difficulty that arises in pursuing this avenue of
investigation is that general multistate discrete data models are
computationally very expensive to fit.

An alternative response to our rejection of the Roy model that is
pursued in the rest of this paper preserves the two-market-sector split
and generalizes the basic Roy model.

II. An Extended Roy Model 

A. The Model

We extend the Roy model by (a) assuming that workers maximize
utility and not just money income in making their sectoral choice
decisions,11 (b) decomposing earnings into hourly wage rates and
hours of work and assuming that the latter are freely chosen, (c)
developing a general nonnormal model for unmeasured tasks $(u_1, u_2)$
that nests Roy’s model as a special case, and (d) incorporating a
nonmarket or household production sector as an alternative to market
activity. All four extensions are required to produce a
two-market-sector model of hourly wage rates that fits data from the U.S. labor
market and survives the test criteria presented in Section I. We focus
on explaining wage rates in our empirical analysis leaving the
empirical analysis of hours of work and earnings for another occasion.

11 Lee (1978) was the first to make this extension in a model without a nonmarket
sector.

In place of task function (7), which maps skills to tasks, we utilize a
more general Box-Cox model

$$\frac{t_i^{\lambda_i} - 1}{\lambda_i} = \beta_i x + u_i, \quad i = 1, 2. \tag{12}$$

Random variable $u_i$ is equated to an underlying mean zero normal
random variable $u_i^*$ for values of that variable that produce positive
values of $t_i$, that is, $u_i = u_i^*$ if

$$1 + \lambda_i(\beta_i x + u_i^*) > 0. \tag{13}$$

Imposing a nonnegativity restriction on the admissible $t_i$ avoids both
mathematical and economic absurdities.12 The joint density for $(t_1, t_2)$
is presented in Appendix A. A convenient feature of our statistical
model is that when $\lambda_i = 0$ equation (12) specializes to Roy’s model (7),
which always satisfies inequality (13) so $u_i$ is normally distributed. By
estimating $\lambda_i$ we can determine whether or not the lognormal Roy
model fits the data.
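
The Box-Cox task function (12), its inverse, and the positivity condition (13) can be written out directly. This is a minimal sketch; the function names are hypothetical, and `index` stands for the sum $\beta_i x + u_i$.

```python
import math

def boxcox(t, lam):
    """Left-hand side of (12): (t**lam - 1) / lam, or ln t when lam is zero."""
    if t <= 0.0:
        raise ValueError("tasks must be positive")
    return math.log(t) if lam == 0 else (t ** lam - 1.0) / lam

def task_from_index(index, lam):
    """Invert (12): the task level implied by index = beta * x + u."""
    if lam == 0:
        return math.exp(index)          # Roy's lognormal case, equation (7)
    base = 1.0 + lam * index
    if base <= 0.0:
        raise ValueError("inequality (13) is violated for this index")
    return base ** (1.0 / lam)

task = task_from_index(0.7, -0.5)        # admissible: 1 + (-0.5)(0.7) > 0
near_lognormal = task_from_index(0.7, 1e-6)
```

As $\lambda_i$ approaches zero the implied task level approaches $\exp(\beta_i x + u_i)$, which is the sense in which (12) nests the lognormal Roy specification.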

In our more general model, self-selection (with either
income-maximizing or utility-maximizing selection rules) does not necessarily
decrease the variance of $\ln t_i$ over what it would be in nonselected
populations as is the case in the Roy model. In Heckman and Sedlacek
(1986) we demonstrate that the more negative are the values of $\lambda_i$ and
the more negatively correlated are the latent normal random
variables $(u_1^*, u_2^*)$, the more likely it is that selection increases the variance
of $\ln t_i$ and $\ln w_i$ for workers employed in sector $i$. In our more general
model, self-selection can increase inequality (measured by the
variance of logs) both within and between sectors over what it would be in
the absence of self-selection, whereas in the Roy model selection must
decrease within-sector inequality.

Our model can produce a Pareto tail for wage rates or tasks
whereas the tails in the Roy model are thinner than Pareto tails. A
Paretian tailed density $g(t_1)$ has a tail such that

$$\lim_{t_1 \to \infty} g(t_1) \sim c / t_1^{\alpha}, \quad \alpha > 1, \quad c > 0.$$

Using the expression for the density of $t_1$, $f(t_1)$, given in equation (A2)
of Appendix A and assuming $\lambda_1 = 0$, we get

$$\lim_{t_1 \to \infty} \frac{f(t_1)}{g(t_1)} = 0$$

so that the lognormal has a thinner tail than a Pareto density.13 For $\lambda_1
< 0$, our model has a Paretian tail in the sense that for each value of $\alpha$
it is possible to select $\lambda_1 = 1 - \alpha$ so that the density of $t_1$ has the same
tail behavior as the selected member of the Pareto family. Our
proposed model can capture a feature of income and wage distributions
claimed to be empirically important by Mandelbrot (1962) and Lydall
(1968), whereas Roy’s lognormal model cannot.

12 Poirier (1978) and Amemiya and Powell (1981) have noted the importance of this
restriction in applying the Box-Cox model.

13 The same is true for a censored lognormal density where the censoring is due to
self-selection decisions by agents (see Heckman and Sedlacek 1986). There we establish
that the tail behavior of the censored normal and censored Box-Cox models is the same
as the tail behavior of the uncensored models.
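
The tail comparison can be illustrated numerically by measuring the local slope of the log density against $\ln t_1$. The sketch below is not the density (A2) of the appendix: it uses the change-of-variables density implied by (12) with a standard normal index, drops the normalizing constant (which does not affect the slope), and all function names and evaluation points are hypothetical.

```python
import math

def boxcox_log_density(t, lam, mean=0.0, sd=1.0):
    """Unnormalized log density of t when (t**lam - 1)/lam is N(mean, sd**2)."""
    y = math.log(t) if lam == 0 else (t ** lam - 1.0) / lam
    # Normal kernel in y plus the log Jacobian dy/dt = t**(lam - 1).
    return -0.5 * ((y - mean) / sd) ** 2 + (lam - 1.0) * math.log(t)

def tail_exponent(lam, t=1e6, step=1.01):
    """Local Pareto exponent: minus the slope of the log density in ln t."""
    rise = boxcox_log_density(t * step, lam) - boxcox_log_density(t, lam)
    return -rise / math.log(step)

pareto_like = tail_exponent(-0.5)          # close to alpha = 1 - lam = 1.5
lognormal_far = tail_exponent(0.0, t=1e9)
lognormal_near = tail_exponent(0.0, t=1e6)
```

For $\lambda_1 = -0.5$ the local exponent settles near $\alpha = 1.5$, whereas for the lognormal case it keeps growing as $t_1$ increases, so no finite Pareto exponent describes its tail.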

We extend Roy’s model by including nonmarket participation as an 
option available to agents and by assuming that agents are utility 
maximizers rather than simple money income maximizers. The utility 
of participating in each sector is assumed to depend on sector-specific 
attributes such as wage rates, sector-specific consumption attributes 
(e.g., employment risk and job status), and the utility value of options 
that accrue to agents who participate in the sector (e.g., entitlement 
effects for social programs conditioned on sectoral participation as 
discussed in Mortensen [1977]). Letting $V_i$ denote the utility of
participating in sector $i$, where $i = 3$ designates the nonmarket sector, an
agent chooses to participate in sector $i$ if utility is maximized by doing
so, that is,

$$V_i > V_j, \qquad i \neq j,\quad i, j = 1, 2, 3. \tag{14}$$

Let $z_i$ denote a vector of measured sector-specific consumption
attributes and household characteristics variables. Array all the $z_i$, skill
characteristics $x$, and log task prices $\ln \pi_i$ into vector $f$. Solving out for
wages as a function of $x$ and $\ln \pi_i$ in the utility functions, we reach a
reduced-form linearized index function

$$\ln V_i = \gamma_i f + v_i, \qquad i = 1, 2, 3. \tag{15}$$

We assume that $f$ is distributed independently of all the $v_i$ and that
$(v_1, v_2, v_3)$ is a mean zero multivariate normal random variable

$$(v_1, v_2, v_3) \sim N(0, \Sigma_v). \tag{16}$$

This specification produces the Thurstone multivariate probit model
analyzed by Bock and Jones (1968) and Domencich and McFadden
(1975).$^{14}$


14 A more explicit derivation of (15) and (16) from classical consumer choice theory
adopts a loglinear specification for the mixed direct and indirect utility functions:

$$\ln V_i = \psi_{0i} + \psi_{1i} \ln w_i + \psi_{2i} z_i + \omega_i, \qquad i = 1, 2, \tag{*}$$

$$\ln V_3 = \psi_{03} + \psi_{23} z_3 + \omega_3 \tag{**}$$

and assumes that $\omega = (\omega_1, \omega_2, \omega_3) \sim N(0, \Sigma_\omega)$ and that $\omega$ is distributed independently of
$z_i$ for all $i$. Labor supply decisions within each sector are assumed to be optimally
determined by agents. By permitting the coefficients in each sector to assume separate
values for variables that are common to all $z_i$ vectors, we recognize that sectors may
differ in their consumption and investment possibilities. Substituting for $w_i = \pi_i t_i$ in (*)
using (12) and assuming that $\lambda_i$ is approximately but not exactly zero produces

$$\ln V_i = (\psi_{0i} + \psi_{1i} \ln \pi_i) + \psi_{1i} \beta_i x + \psi_{2i} z_i + (\psi_{1i} u_i + \omega_i),$$

where $\psi_{1i} u_i + \omega_i = v_i$, $i = 1, 2$, is approximately normally distributed because (13) is
satisfied for all $u_i^*$ in the neighborhood of $\lambda_i \approx 0$ for $i = 1, 2$, and $u_i$ and $\omega_i$ are assumed
Since only sectoral choices and not the $V_i$ are directly measured, it is
possible to identify only parameters of the contrasts of utility
evaluations among sectors. Without any loss of generality we normalize
$V_3 = 1$ so $\gamma_3 = 0$ and $v_3 = 0$. Using this convention, sector $i$ is chosen if

$$\ln V_i - \ln V_j > 0, \quad \text{i.e.,} \quad (\gamma_i - \gamma_j) f + v_i - v_j > 0 \quad \text{for all } j \neq i.^{15} \tag{17}$$

In Heckman and Sedlacek (1986) we prove that in a model with at
least one nondegenerate regressor in $f$, it is possible to identify $\gamma_1$, $\gamma_2$,
var$(v_2)$, and cov$(v_1, v_2)$ from data on observed sectoral choices
provided that var$(v_1)$ is normalized to unity.$^{16}$
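
Why only contrasts and one scale are identified can be illustrated with a small simulation. The sketch below is a hedged illustration, not the estimator used in the paper: the coefficient values, the independent normal draws for the unobservables, and all function names are assumptions. It shows that subtracting the nonmarket index and dividing by any positive constant leaves every simulated sectoral choice unchanged.

```python
import random

def log_utilities(gammas, f, v):
    """ln V_i = gamma_i . f + v_i for each of the three sectors (eq. 15)."""
    return [sum(g * x for g, x in zip(gamma_i, f)) + v_i
            for gamma_i, v_i in zip(gammas, v)]

def chosen_sector(log_v):
    """Index of the sector with the largest ln V_i (inequality 14)."""
    return max(range(len(log_v)), key=lambda i: log_v[i])

def normalize(log_v, scale):
    """Subtract the nonmarket index (third entry) and divide by a positive scale."""
    base = log_v[2]
    return [(value - base) / scale for value in log_v]

rng = random.Random(0)
gammas = [[0.4, 0.1], [0.2, 0.3], [-0.1, 0.0]]   # hypothetical coefficients
draws = 1000
agree = 0
for _ in range(draws):
    f = [1.0, rng.gauss(0.0, 1.0)]                # intercept plus one regressor
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]   # independent draws, for simplicity
    raw = log_utilities(gammas, f, v)
    agree += chosen_sector(raw) == chosen_sector(normalize(raw, scale=2.0))
```

Because the choices coincide in every draw, data on choices alone cannot distinguish the original parameters from the normalized ones, which is why one location and one scale normalization are imposed.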


B. The Statistical Model 


The statistical model used to generate the empirical estimates re¬ 
ported in this paper is presented in Appendix B. It joins reduced- 
form sectoral choice equations (17) with wage equations produced by 
the Box-Cox model (12), where the wage is $w_i = \pi_i t_i$, so

$$\frac{(w_i/\pi_i)^{\lambda_i} - 1}{\lambda_i} = \beta_i x + u_i, \qquad i = 1, 2. \tag{18}$$
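
A minimal sketch of how equation (18) maps an index value into a wage is given below. The function names, the task price of 1.2, and the index value 0.35 are assumptions chosen for illustration; the two nonzero transformation parameters are the estimates quoted in the notes (0.08 and -0.06).

```python
import math

def boxcox(y, lam):
    """Box-Cox transform (y**lam - 1)/lam, with the log limit at lam = 0."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def inverse_boxcox(z, lam):
    """Invert the transform; needs 1 + lam*z > 0 when lam != 0."""
    if lam == 0:
        return math.exp(z)
    base = 1.0 + lam * z
    if base <= 0:
        raise ValueError("index outside the admissible range for this lambda")
    return base ** (1.0 / lam)

def wage(task_price, lam, index):
    """w_i = pi_i * t_i, where the task t_i solves boxcox(t_i, lam) = beta_i x + u_i."""
    return task_price * inverse_boxcox(index, lam)

index = 0.35   # hypothetical value of beta_i x + u_i
w_manuf = wage(task_price=1.2, lam=0.08, index=index)
w_nonmanuf = wage(task_price=1.2, lam=-0.06, index=index)
w_lognormal = wage(task_price=1.2, lam=0.0, index=index)
```

Applying `boxcox` to `w / task_price` recovers the index, which is equation (18) read from left to right. The admissibility check in `inverse_boxcox` corresponds to the requirement that tasks and wages be nonnegative.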


We adopt a reduced-form approach to estimation because all the 
determinants of market wages are plausible determinants of utility in 
their own right. No restrictions are imposed between the parameters 
of (18) and the parameters of sectoral choice equations (17). Estimating
an unrestricted sectoral choice model yields an upper bound on
the goodness of fit of a more restricted explicitly structural sectoral 
choice model. 

Provided that prices ($\pi_i$) are normalized to unity in a year and that
there is stability over time in some parameters on the right-hand side
of (18) (i.e., in elements of $\beta_i$ or the variance of $u_i$), it is possible from
successive cross sections of data to estimate task prices $\pi_{il}$, $i = 1, 2$,
$l = 1, \ldots, L$. The selected normalization defines the units in which task
prices are measured. In addition, if $\lambda_i \neq 0$, it is possible to estimate
year effects (shifts in the intercept) in wage equation (18). Denote
these year effects by $\beta_{0il}$, $i = 1, 2$, $l = 1, \ldots, L$.$^{17}$
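
The role of $\lambda_i \neq 0$ can be seen from the algebra of (18): multiplying through by $\pi^{\lambda}$ turns the task price into a factor that scales the slope and the unobservable, rather than a pure intercept shift. The check below is a sketch under assumed values (the index value, the three prices, and the names are hypothetical).

```python
def transformed_wage(w, lam):
    """(w**lam - 1)/lam: the wage equation's left-hand side with the price moved right."""
    return (w ** lam - 1.0) / lam

def wage_from_index(task_price, lam, index):
    """Solve ((w/pi)**lam - 1)/lam = index for w (lam != 0)."""
    return task_price * (1.0 + lam * index) ** (1.0 / lam)

lam, index = 0.08, 0.35          # index stands for beta_i x + u_i (hypothetical value)
prices = [1.0, 0.8, 1.25]        # hypothetical task prices in three years
scales, gaps = [], []
for task_price in prices:
    w = wage_from_index(task_price, lam, index)
    phi = task_price ** lam      # scale factor implied by the task price
    rhs = (phi - 1.0) / lam + phi * index
    scales.append(phi)
    gaps.append(abs(transformed_wage(w, lam) - rhs))
```

With the price normalized to one the scale factor equals one; in other years it multiplies the index, so stability of a slope coefficient or of the error variance over time is what allows the price to be recovered. When $\lambda_i = 0$ the same steps give $\ln w = \ln \pi + \beta x + u$, and the price is absorbed into the intercept.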


to be independent of $x$, $z_i$, and $\ln \pi_i$. Set $v_3 = \omega_3$. The assumption that the $\lambda_i$ are close to
zero appears to be consistent with the data. In Sec. III we note that the estimated $\lambda_i$ for
manufacturing is 0.08 ($\lambda_2$) while the value for nonmanufacturing is -0.06 ($\lambda_1$).

15 Indifference occurs on a set of measure zero by virtue of the normality assumption
for $(v_1, v_2, v_3)$ assuming that these random variables are nondegenerate.

16 One such normalization is required. The probabilities of the events described by
inequalities (17) are unchanged if the inequalities are all divided by the standard
deviation of $v_1$.

17 More precisely, and using the notation for time-series task prices and year effects in
Because wage equation (18) contains year effects (the $\ln \pi_{il}$ and $\beta_{0il}$)
and wages are an argument of the utility function, and to allow for
year-specific shocks to preferences, year effects are introduced into
the reduced-form sectoral choice functions (15). These year effects
are denoted by $\gamma_{0il}$, $i = 1, 2$, $l = 1, \ldots, L$.

Using the likelihood presented in Appendix B it is possible in a
single cross section of data to estimate consistently the $\lambda_i$, $\beta_i$, and $\gamma_i$,
$i = 1, 2$, as well as the variances of the latent normal variables $u_i^*$
(which generate $u_i$), their covariance with $v_1$ and $v_2$, and the covariance
structure of $(v_1, v_2)$ (setting the variance of $v_1$ to one or adopting some
other normalization). The covariance between $u_1^*$ and $u_2^*$ is not
identified.$^{18}$ From repeated cross-section data, it is possible to estimate
consistently the prices $\pi_{il}$ and the year effects $\beta_{0il}$, $\gamma_{0il}$, $i = 1, 2$,
$l = 1, \ldots, L$, given a conventional normalization (suppressing intercepts
or setting one value of these parameters to a known constant). The


wage functions introduced in the text, if $\lambda_i = 0$ for $i = 1, 2$, eq. (18) specializes to
(making an obvious change of notation)

$$\ln w_{il} = \ln \pi_{il} + \beta_{il} x + u_{il}, \qquad i = 1, 2,\ l = 1, \ldots, L,$$

so that $\ln \pi_{il}$ is indistinguishable from the intercept term. Assuming that $\beta_{0il}$ is
constant in successive cross sections and letting $\pi_{il} = 1$ in one year, it is possible to
estimate a time series of task prices from selection-corrected wage functions. If $\lambda_i \neq 0$,
$\pi_{il}$ enters the model as a scale parameter. Assuming that $\pi_{il} > 0$, (18) may be rewritten
as

$$\frac{w_{il}^{\lambda_i} - 1}{\lambda_i} = \frac{\phi_{il} - 1}{\lambda_i} + \phi_{il} \beta_{il} x + \phi_{il} u_{il}, \tag{*}$$

where $\phi_{il} = (\pi_{il})^{\lambda_i}$ for $i = 1, 2$ and $l = 1, \ldots, L$. With no restrictions over time in the
variance of $u_{il}$ or the slopes or the intercepts of the wage function, it is not possible to
estimate a time series of task prices $\pi_{il}$ from selection-corrected wage functions (i.e., $\pi_{il}$
can always be set to unity in each year without affecting the fit of the model). By
assuming, e.g., that one slope coefficient remains constant over time or that the
variance of $u_{il}$ remains time invariant, it is possible to estimate $\pi_{il}$ given one normalization
($\pi_{il} = 1$ for a particular year) from selection-corrected wage functions. Evidence
in support of the proportionality hypothesis (invariance of the slope coefficients of
selection-corrected wage functions) justifies the procedure used to estimate task prices.
Notice that separate values of $\lambda_i$ can be estimated in each cross section irrespective of
whether or not $\pi_{il}$ can be identified. Year effects in the wage equation (the $\beta_{0il}$) can be
estimated along with task prices ($\pi_{il}$) if the latter are identified by assuming temporal
invariance in slope or variance parameters. One year effect ($\beta_{0il}$) must be set to zero
unless the intercept of (18) is deleted. Note further that for the case $\lambda_i \neq 0$ ($i = 1, 2$) the
estimated $\pi_{il}$ are indistinguishable from a very special type of technical change in the
task functions that scales the slope coefficients and unobservables by a common
parameter and shifts the intercept in a restricted way (see the $\phi_{il}$ above in eq. [*]). The only
way to determine if the estimated $\pi_{il}$ are valid prices is to see whether or not they act
like prices in a behavioral equation. The evidence presented in Sec. III suggests that
they do.

18 Lee (1978) demonstrates that this parameter is not identified in a two-market-sector
utility-maximizing lognormal Roy model. This lack of identification is a
consequence of the introduction of new unobservables in the sectoral choice functions that
are not directly attributable to the unobservables in the wage equations.
maximum likelihood estimator of the parameters of this model is
consistent and asymptotically normal.$^{19}$

The estimated task prices ($\pi_{il}$) can be used as input to estimate
consistently the aggregate demand for task functions following the
general methodology outlined at the end of Section I. When $\lambda_i \neq 0$, it
is possible under conditions stated in note 17 to estimate $\beta_{0il}$ and $\pi_{il}$
separately. In that case, equations (10) and (11) can be rewritten to
account for the fact that $\pi_{il}$ and $\beta_{0il}$ are not confounded. This point is
discussed further below.


III. Empirical Estimates 

In this section we report empirical estimates and tests of the extended 
Roy model described in Section II and of the sectoral demand for 
aggregate task functions described in Section I. We use these esti¬ 
mates to explore the empirical importance of aggregation bias in 
obscuring aggregate real wage movements. We also assess the contri¬ 
bution of self-selection to inequality in the distribution of log wage 
rates. 

One convenient feature of our model is that it is not necessary to
estimate the extended Roy model for all demographic groups in order
to estimate task prices, $\pi_{il}$, or sectoral task demand functions.
Assuming that all units of task $i$ are perfect substitutes irrespective of
their demographic source, estimates of $\pi_{il}$ for one demographic
group suffice to identify market task prices.$^{20}$ Dividing the aggregate
wage bill for all demographic groups by the estimated task price
produces a consistent estimate of the total amount of the task supplied to
the market that can be used in the estimation of aggregate demand
for task functions.

Another convenient feature of our model is that it is not necessary
to estimate the extended Roy model for the reference demographic
group for each available cross section. Assuming that the
proportionality hypothesis is not rejected and the estimated model passes a
goodness-of-fit test, we can estimate the slope coefficients of the
model in a single cross section and fix these coefficients in other cross
sections using the rest of the data to estimate year effects (the $\gamma_{0il}$ and
$\beta_{0il}$) and the log task prices ($\ln \pi_{il}$).


19 An anonymous referee suggested that inequality (13) leads to a violation of
classical regularity conditions because the range of the random variable $u_i^*$ depends on
parameters of the model. Inequality (13) requires only that the $t_i$ and $w_i$ be nonnegative,
and so no violation of classical regularity conditions is induced by this restriction.

20 This assumes no market discrimination. By estimating the extended Roy model for
separate demographic groups, we can test for market discrimination. If there is no
market discrimination the estimated $\pi_{il}$ should be the same across different
demographic groups.



10 q6 JOURNAL OF POLITICAL ECONOMY 

We exploit both features of the model to reduce the computational 
cost required to secure the empirical results reported below. We use 
prime age white males aged 18-65 as our reference demographic 
group. We test the proportionality hypothesis and perform goodness- 
of-fit tests for the model on two years of data (1976 and 1980). The 
evidence suggests that it is legitimate to constrain the slope coeffi¬ 
cients to equality in all years, using the remaining cross sections of 
data to estimate year effects and task prices. 

This empirical strategy substantially reduces the computational 
cost. However, this saving is secured by assuming what in principle 
can be tested: (a) that estimated task prices are identical across all 
demographic groups and ( b ) that proportionality and goodness-of-fit 
tests are passed for all demographic groups in all years. We leave the 
execution of such tests for another occasion, recognizing that the 
empirical results reported below may be overturned in a more exten¬ 
sive battery of tests. 

A. Estimates of the Extended Roy Model 

We estimate the extended Roy model on a 4 percent random sample
of prime age white males taken from the annual March Current
Population Survey (CPS) for the years 1968-81 inclusive.$^{21}$ These
data are described in detail in Appendix C. When the extended Roy
model is fit on the complete sample it is decisively rejected. The pro¬ 
portionality hypothesis is rejected, and goodness-of-fit tests indicate 
that the model does not fit the empirical log wage distributions. How¬ 
ever, when low-wage observations (persons whose real wages are less 
than $0.75 per hour) are deleted, the model is not rejected. The 
empirical tests reported in this paper are based on samples that ex¬ 
clude such observations. The likelihood function presented in Ap¬ 
pendix B explicitly accounts for this sample selection criterion. 

The estimated model parameters are presented in table 1. This 
table records estimates based on a pooled 1976 and 1980 sample. 
Individuals are classified into one of three sectors depending on their 
source of income for the year. Roughly 16 percent of the sample has 
no labor income in 1980. Individuals without labor earnings are 
defined as participants in the nonmarket sector for that year (sector
3). Following census definitions, individuals are defined to be in the
manufacturing sector if their SIC three-digit industry code is between 


21 Lillard, Smith, and Welch (1982) note the high nonreporting rate for key
economic variables in the CPS and discuss the imputation procedures used by the Census
Bureau. They demonstrate that there is a potential for substantial bias in using imputed
CPS data. We eliminate all imputed observations from our analysis.
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107 and 398. Roughly 21 percent of the 1980 sample falls into this 
category. The rest of the sample (63 percent) is classified as working 
in nonmanufacturing. 

The first two sets of rows of the table record the parameters of the
contrast between the indicated sector reduced-form preference
function and the nonmarket sector preference function (the $\gamma$ in eq. [15]).
The arguments include conventional determinants of wages
(education, work experience, and work experience squared) plus a South
dummy (= 1 if a person resides in the South and = 0 otherwise) to
capture regional wage and amenity differences. In addition,
predicted nonlabor income is assumed to enter the preference function.
Nonlabor income consists of all nonemployment income including
unemployment benefits and social transfers. Nonlabor income is
predicted for each individual in each sector to account for the fact that
entitlements to various social programs (e.g., unemployment
insurance) are conditioned on sectoral participation and also to eliminate
spurious correlation between nonlabor income and unobserved
components of preferences. The predictor variables are presented in
Appendix C. The 1980 intercept is a dummy variable that equals one if
an observation comes from the 1980 sample and is zero otherwise. (Its
coefficient estimates $\gamma_{0il}$ for 1980.) The estimates reveal that
education and work experience increase the probability of market
participation. These variables have a slightly stronger effect on participation
in the nonmanufacturing sector than on participation in the
manufacturing sector. The South dummy has little effect on the
nonmarket-manufacturing choice but a stronger effect on the
nonmarket-nonmanufacturing choice.

The coefficients on predicted nonlabor income are positive for both
estimated sectoral utility functions, and statistically significantly so. At
first sight this result is counterintuitive and appears to indicate that
leisure is an inferior good. Positive coefficients are consistent with
Mortensen's (1977) entitlement effect in which individuals participate
in a sector to collect sector-specific social benefits (e.g., unemployment
benefits or workmen's compensation). They are also consistent with
the hypothesis that individuals are willing to forfeit income to enjoy
the training or consumption benefits that accrue to individuals
working in specific sectors.

The insignificant coefficients for the 1980 dummy variables
indicate that 1980 reduced-form preferences do not differ in intercept
from 1976 preferences. The next parameters reported in table 1 are
estimates of the covariance structure for the unobservables in the
utility contrasts $(v_1, v_2)$. These unobservables are positively related,
with correl$(v_1, v_2) = .29$, and the variability of the unobservables in the
manufacturing sector contrast, var$(v_2)$, is smaller than the variability
of the unobservables in the nonmanufacturing sector contrast (which
is normalized to unity).

22 Recall that predicted nonlabor income and not actual nonlabor income is used to
avoid a spurious correlation between assets (and benefits) and sectoral preferences. In
many honestly conducted and reported empirical studies of male labor supply, leisure
is found to be "inferior." The argument in the text provides one rationale for this
finding.

The next two blocks of rows reported in table 1 report the 
coefficients of the estimated task (eq. [12]) or wage (eq. [18]) functions 
(the $\beta$). Except for predicted nonlabor income, the same variables
that enter the sectoral choice equations enter the wage equations. The 
data on hourly wages are constructed by dividing annual labor in¬ 
come by estimated annual hours of work. 

Education has a strong positive effect in both sectors, but its effect is 
twice as strong in manufacturing. Wages grow much more steeply 
with work experience in the manufacturing sector than in the non¬ 
manufacturing sector over the empirically relevant range. The
hypothesis of no wage growth with work experience cannot be rejected
for nonmanufacturing wages. The South dummy is statistically
insignificant in both task functions. The 1980 dummy variable is
statistically insignificant for both sets of coefficients, indicating little
difference in the estimated intercepts of the task functions between 1976
and 1980 (the $\beta_{0il}$ for those respective years).

The variance of $u_1^*$ is greater than the variance of $u_2^*$. This is
consistent with greater heterogeneity among the group of industries
classified in the nonmanufacturing sector.

The estimated log task price changes for 1980 indicate a 22 percent 
decline in the price of the manufacturing task from its 1976 level and 
a 21 percent increase in the price of the nonmanufacturing task from 
its 1976 level. The estimated transformation parameter for
manufacturing ($\lambda_2$) is positive, indicating that manufacturing wage rates do not
have a Paretian right tail. The estimated transformation parameter
for nonmanufacturing wages ($\lambda_1$) is slightly negative, indicating a
Paretian right tail for wages in that sector.

Even though both values of $\lambda_i$ are estimated to be close to zero, a
likelihood ratio test of the hypothesis that $\lambda_1 = \lambda_2 = 0$ performed on
the 1976 data rejects that hypothesis. The hypothesis is also rejected
with the 1980 data.$^{24}$


23 Heckman and Polachek (1974) estimate a negative value of $\lambda$ for hourly wages
using a Box-Cox procedure fit on data aggregated over both sectors. Their procedure
does not account for the truncation discussed in App. A or the censoring discussed in
App. B.

24 A direct test of the hypothesis $\lambda_1 = \lambda_2 = 0$ for pooled 1976-80 data with
period-specific intercepts for the task function ($\beta_{0il}$ for $i = 1, 2$ and $l = 1, \ldots, L$) and prices ($\pi_{il}$
Further evidence in support of our more general model comes
from goodness-of-fit statistics for the extended Roy model. Using the
fixed wage intervals described in the notes to table 1 to compute a
goodness-of-fit test, we do not reject the hypothesis that the model fits
manufacturing log wage data using a 5 percent level of significance.
Compare the 34.1 statistic to the 42.7 statistic reported in the
next set of rows in table 1 that results when $\lambda_2 = 0$. Inspection of
figure 3 reveals that the estimated model closely fits the pooled 1976
and 1980 data.

The performance of the extended Roy model in predicting log
wages in nonmanufacturing is also satisfactory. At a 5 percent
significance level we do not reject the hypothesis that the model fits
nonmanufacturing log wage data. The $\chi^2$ statistic for the extended
Roy model is 61.7, and for the normal model ($\lambda_1 = 0$) it is 71.9. Figure
4 reveals that the fit of the model to the nonmanufacturing data is
rather good.

Heckman and Sedlacek (1986) compare plots of the extended Roy 
model with lognormal models with and without the assumption of 
utility maximization and with and without the presence of a nonmar¬ 
ket sector. We note that it is the addition of the nonmarket sector that 
substantially improves the fit of the model. 

The final row of table 1 reports the result of a test of a strengthened
version of the proportionality hypothesis stated in Section I.
Although the procedure used to estimate task prices requires only
temporal invariance of some of the slope coefficients of the task function
(12) or the variance of $u_i^*$, or some other restrictions across time in
these parameters, an estimated model in which preferences shift
about in each year for unexplained reasons would not be
economically very interesting. Accordingly, we test for stability of the slope
coefficients and covariance structure of the task functions and the
preference functions in 1976 and 1980. (The intercepts of the
preference functions might be expected to shift over time since they
depend, inter alia, on task prices.) The restrictions tested here are thus
much stronger than the ones required to identify $\pi_{il}$.

The statistic reported in the final row of table 1 is produced by
comparing the likelihoods for the pooled 1976 and 1980 data with the
sum of separate likelihoods fit for 1976 and 1980 separately. Using a
5 percent significance level, we do not reject the strong
proportionality hypothesis.$^{25}$ When the model is restricted to a lognormal
form ($\lambda_1 = \lambda_2 = 0$) we decisively reject the hypothesis (see Heckman
and Sedlacek 1986).

for $i = 1, 2$, and $l = 1, \ldots, L$) raises a messy statistical problem. As noted in n. 17, when
$\lambda_1 = \lambda_2 = 0$, it is not possible to estimate separate $\beta_{0il}$ and $\pi_{il}$ parameters, and so some
parameters of the model become unidentified. Conventional likelihood ratio tests do
not possess classical limiting distributions. While it is possible to construct a test of the
hypothesis in this case (see Davies 1977), we have not done so here.

Elsewhere (Heckman and Sedlacek 1986) we also decisively reject 
the proportionality hypothesis as a description of wage data aggre¬ 
gated over both market sectors (with self-selection ignored). An effi¬ 
ciency units assumption for labor quality does not describe the U.S. 
labor market taken as a whole. However, the empirical results just pre¬ 
sented indicate that an efficiency units assumption for sector-specific 
tasks is valid within manufacturing and nonmanufacturing sectors. 
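
The pooled-versus-separate comparison used to test the strong proportionality hypothesis is a likelihood ratio test. The sketch below shows the mechanics under stated assumptions: the log likelihood values are invented for illustration, the function names are hypothetical, and the parameter counts (32 per cross section, 38 in the pooled model) are those given in the notes. The chi-square tail probability uses the closed form available for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate for even df.
    Uses exp(-x/2) * sum_{j < df/2} (x/2)**j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("the closed form used here needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

def likelihood_ratio(loglik_separate, loglik_pooled):
    """LR statistic: twice the gain in log likelihood from relaxing the restrictions."""
    return 2.0 * (sum(loglik_separate) - loglik_pooled)

df = 2 * 32 - 38                                      # 64 unrestricted less 38 restricted
lr = likelihood_ratio([-5210.4, -5388.1], -10613.9)   # hypothetical log likelihoods
p_value = chi2_sf_even_df(lr, df)
reject_at_5_percent = p_value < 0.05
```

With these made-up likelihoods the statistic is about 30.8 on 26 degrees of freedom, below the 5 percent critical value, so the restrictions would not be rejected; the actual statistic is the one reported in the final row of table 1.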

B. Estimating the Demand for Aggregate
Sector-specific Tasks

The extended Roy model could be fit to each available year of the CPS
data (1968-81) to estimate the log task prices, $\ln \pi_{il}$, $i = 1, 2$ and
$l = 1, \ldots, L$. To do so would be prohibitively expensive. Because the
strong proportionality hypothesis is not rejected, it appears
reasonable to fix all the parameters of the model at the values reported in
table 1 except for the intercepts of the wage ($\beta_{0il}$) and preference
functions ($\gamma_{0il}$) and the log task prices ($\ln \pi_{il}$) and to estimate the
intercepts and log task prices from each year of the available CPS
data. Estimates obtained from this procedure are reported in
Appendix D.$^{26}$

Given a time series of $\ln \pi_{il}$ it is possible to estimate the parameters
of the sectoral demands for aggregate tasks using a modification of
the procedure described in Section I. In making the modification note
that when $\lambda_i \neq 0$ it is possible to separate $\ln \pi_{il}$ from the intercepts of
the task functions (the $\beta_{0il}$, $i = 1, 2$, $l = 1, \ldots, L$; see n. 17). Making
this change we modify equation (19) to read

ln(— ' -jr ~ -) = S 0 , + (S i, -4- I )(ln tr,/ - In P,/) 

(19) 

+ s " '"(“pH + 

25 When a separate model is fit in each year, the $\pi_{il}$ are not identified (see n. 17). In
each cross section 32 parameters are estimated (the number of parameters reported in
table 1 less the 1980 dummy variables and the $\pi_{il}$ terms). Thus 64 parameters are
estimated in the unrestricted version. When all but the intercepts and task prices are
constrained to equality in the pooled 1976 and 1980 sample, 38 parameters are
estimated. Consequently, there are 26 degrees of freedom reported in table 1.

26 One referee objected that the log task prices reported in App. D show too much
temporal variability to be believed. This is an unusual argument in that neither tasks
nor their prices are directly observed. Surely the only way to judge whether or not an
estimate of a price is valid is to see if the estimated price acts like a price in an estimated
behavioral relationship.
$i = 1, 2$ and $l = 1, \ldots, L$, where $\ln \hat{\pi}_{il}$ is the intercept estimated from
the microeconomic log wage equation fit in sector $i$ in year $l$ (eq. [8])
and $\tilde{\varepsilon}_{il}$ differs from $\varepsilon_{il}$ by the estimation error of $\ln \hat{\pi}_{il}$ for $\ln \pi_{il}$;
$\tilde{\varepsilon}_{il} = \varepsilon_{il} + (1 + \delta_{1i})(\ln \pi_{il} - \ln \hat{\pi}_{il})$. We assume that the task price for white
males is the task price for all demographic groups.

Estimates of the sectoral aggregate demand for task functions 
(eq. [19]) are presented in table 2. The sectoral wage bill data come 
from U.S. Commerce Department data on total wages paid. (For fur¬ 
ther information on these data, see App. C.) The log of the real wage 
bill divided by real product price for each sector is regressed on the
logs of (1) the estimated task price, (2) an index of energy prices, (3)
an index of intermediate goods prices for the sector, and (4) the user 
cost of capital. Each of these prices is deflated by the real product 
price. Definitions and data sources for these variables are presented 
in Appendix C. 

As noted in Section I, it is implausible that the $\ln \pi_{il}$ and $H_{il}$ are
exogenous variables in the aggregate task demand functions.
Instrumental variable estimates based on the set of instruments recorded in
the notes to the table are given in the right-hand-side columns of table
2. Not surprisingly, the instrumental variable estimates are less
precisely determined than are the least-squares estimates. Note that the
instrumental variable estimates of the elasticity of demand for
unmeasured aggregate tasks are very close to the ordinary least squares
(OLS) estimates, indicating that simultaneous equations bias is not
present.$^{27}$ This empirical result is robust to a variety of choices of the
set of instrumental variables. For this reason we focus our discussion
on the OLS estimates.

The estimated elasticities of demand are negative and statistically
significantly different from zero and thus are in accord with the
predictions of economic theory.$^{28}$ The Durbin-Watson statistics indicate
that serial correlation is not a problem and the $R^2$'s are high.

It is important not to make too much out of these estimated
demand functions. After all, there are only 14 time-series observations
for each sector, and the number of degrees of freedom in the time
series is small. The specification adopted here abstracts from the
dynamic costs of adjustment that have been found to be important in
other studies of labor demand. Nonetheless, our simple model
appears to be consistent with the limited time-series data at our disposal.
It is possible that in a longer time series with more degrees of freedom
a more dynamic model of factor demand would be required to
produce an acceptable fit of the data.

27 A Durbin (1954) test does not reject this hypothesis.

28 The estimated nonmanufacturing sector elasticity is also significantly different
from -1, although this is not the case for the manufacturing elasticity. Thus the
normalized wage bill in nonmanufacturing is significantly related to $\ln \pi_{il}$ (i.e., the
estimated value of $\delta_{1i} + 1$ is statistically significantly different from zero). At least for
nonmanufacturing we can reject the argument that our estimated demand elasticity is
the spurious product of a procedure that subtracts 1 from a coefficient that is not
statistically different from zero (the estimated value of $\delta_{1i} + 1$ in eq. [19]) and finds that
the insignificant coefficient minus 1 is not statistically significantly different from -1.
Note, however, that we cannot reject this argument for the estimates for the
manufacturing sector. Of course, it is possible that the true elasticity of demand for
manufacturing is -1. There is no way to use these data to determine whether or not the estimated
relationship is spurious.

C. Exploring the Importance of Aggregation Bias 
in Aggregate Wages 

The apparent lack of aggregate wage variability over the cycle for 
U.S. data may be a consequence of aggregation bias (see, e.g., Stock- 
man 1983; Bils 1985). Since low-wage workers are the "first to go” in 
response to a downturn in demand, the lack of variability in measured 
average wages may partly reflect an employed worker quality compo¬ 
sition effect. 

Our model can be used to investigate the empirical importance of 
aggregation bias. For the U.S. manufacturing sector we find strong 
evidence of aggregation bias leading to an attenuation of measured 
average wage movements in relationship to the true “quality constant” 
movement in task prices. However, for the economy as a whole, just 
the opposite effect occurs. Aggregation bias increases measured wage 
variability in relationship to the underlying movement in quality con¬ 
stant task prices. 

The manufacturing sector is harder hit by an aggregate distur¬ 
bance such as an oil price increase than is the nonmanufacturing 
sector. Employment declines in the manufacturing sector. Some of 
the former manufacturing workers enter the nonmanufacturing sec¬ 
tor rather than drop out of the work force altogether. The former 
manufacturing workers turn out to be at the bottom of the manufac¬ 
turing task quality distribution, and their exit raises the average qual¬ 
ity of the remaining manufacturing work force and hence attenuates 
the decline in measured average wages. This is the conventional 
aggregation bias effect discussed in the literature. 
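
The conventional composition effect can be reproduced in a few lines. The simulation below is a hedged sketch, not the paper's model: the quality distribution, the size of the price decline, the share of workers who exit, and the rule that the lowest-quality workers leave first are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical log task qualities of workers initially employed in the sector.
log_quality = sorted(rng.gauss(0.0, 0.2) for _ in range(10_000))

d_log_price = -0.015             # assumed fall in the log task price
exit_share = 0.02                # assumed share of workers who leave the sector
n_exit = round(exit_share * len(log_quality))
stayers = log_quality[n_exit:]   # lowest-quality workers leave first (assumption)

# Average log wage = log task price + average log task quality of the employed.
d_avg_quality = statistics.fmean(stayers) - statistics.fmean(log_quality)
d_avg_log_wage = d_log_price + d_avg_quality
```

Because the stayers are of higher average quality, the measured average log wage falls by less than the task price, which is the attenuation described in the text.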

However, the former manufacturing workers who enter the non¬ 
manufacturing sector turn out to be at the bottom of the task qua¬ 
lity distribution in that sector. The new entrants lower the average 
quality of the work force in nonmanufacturing. The reduction in 
quality more than offsets the increase in task price in that sector. On 
net, aggregation bias exaggerates the aggregate decline in real wages 
over what it would be if task quality were held fixed. This effect is 
ignored in macroeconomic studies that neglect labor heterogeneity 
and self-selection.
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Table 3 presents estimates of the impact of a 1 percent increase in 
the price of energy on employment, average sector task quality, task 
price, and average wages for each sector and for the economy as a 
whole. If estimates of the supply functions for all demographic 
groups were available, it would be straightforward to simulate the 
model. Recall, however, that we have estimated supply functions for 
only one demographic group—white males—and so we cannot use a 
direct simulation approach. Instead, we estimate a reduced-form 
equation relating log task prices to energy prices and other determi¬ 
nants of aggregate task demand and supply. These reduced-form 
equations are presented in Appendix E. Assuming structural in¬ 
variance of the parameters of the economy, we can use our estimated 
log task price equation to estimate the effect of an energy price 
change on sectoral task prices. The numbers reported in table 3 are 
based on such reduced-form equations. 

The numbers reported in the first column (for the manufacturing 
sector) and the first panel (for 1972) indicate the following response 
to a 1 percent increase in energy prices: (1) employment in manufac¬ 
turing decreases by 1.854 percent; (2) the average task quality of 
workers employed in the sector rises by 0.919 percent; (3) the task 
price declines by 1.48 percent. Adding effects 2 and 3, we would 
observe average manufacturing wages to decline by only 0.561 per¬ 
cent. Two-thirds of the decline in the manufacturing task price is 
offset by a change in the quality of the work force. The composition 
bias effect in manufacturing is roughly of the same order of mag¬ 
nitude for the two other years (1976 and 1980). 
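
The arithmetic behind the manufacturing figures can be checked in a few lines. This is a minimal sketch that only re-adds the percentages quoted above; the function names are illustrative and not part of the original analysis, and it assumes that percentage effects on price and quality are additive, as the text does.

```python
# Reported 1972 manufacturing responses to a 1 percent energy price increase
# (percent changes quoted in the text).
TASK_PRICE_CHANGE = -1.48   # effect 3: change in the manufacturing task price
QUALITY_CHANGE = 0.919      # effect 2: change in average task quality


def measured_wage_change(task_price_pct: float, quality_pct: float) -> float:
    """Measured average wage change as the sum of price and quality effects."""
    return task_price_pct + quality_pct


def quality_offset_share(task_price_pct: float, quality_pct: float) -> float:
    """Fraction of the task price movement offset by the quality change."""
    return -quality_pct / task_price_pct


wage_change = measured_wage_change(TASK_PRICE_CHANGE, QUALITY_CHANGE)
offset = quality_offset_share(TASK_PRICE_CHANGE, QUALITY_CHANGE)
print(f"measured wage change: {wage_change:.3f} percent")
print(f"share of price decline offset by quality: {offset:.2f}")
```

The offset share comes out at about 0.62, which the text rounds to two-thirds.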

For the nonmanufacturing sector in 1972, a 1 percent oil price
increase raises employment, lowers average employed task quality (by
1.49 percent), and raises task price (by 0.471 percent). The predicted
decline in average wages in the sector is to be compared with the
forecast increase in the nonmanufacturing task price. Similar results
are found for 1976 and 1980.

For the economy as a whole (the third col. of the table), the task 
quality constant wage change is defined to be a weighted sum of the 
sectoral task price changes, where the weights are the employment 
proportions in each sector in the appropriate year. In 1972 the 
simulated aggregate wage decline (-0.950) is much larger than the
skill constant wage change (-0.062). These simulations suggest that
aggregation bias may be empirically important. However, its effect on 


2U Recall that an increase in the price of energy increases the demand for non¬ 
manufacturing tasks (see table 2). These demand functions reflect a shift in relative 
demand from manufacturing to nonmanufacturing in response to a change in energy 
prices. 
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aggregate wage variability is opposite to that conjectured in the recent 
literature, which ignores the effect of self-selection or the pursuit of 
comparative advantage. Aggregation bias increases measured wage 
variability.' 0 
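
The quality constant aggregate wage change is described as an employment-weighted sum of the sectoral task price changes. The sketch below illustrates that weighting with the 1972 figures quoted in the text. The employment weights themselves are not reported in this section, so the manufacturing weight is backed out from the reported aggregate purely as an illustration; it is an inference, not a number from the paper.

```python
def quality_constant_wage_change(task_price_changes, employment_shares):
    """Employment-weighted sum of sectoral task price changes."""
    total = sum(employment_shares)
    return sum(share / total * change
               for share, change in zip(employment_shares, task_price_changes))


def implied_manufacturing_share(mfg_change, nonmfg_change, aggregate_change):
    """Two-sector weight on manufacturing that reproduces the aggregate change."""
    return (nonmfg_change - aggregate_change) / (nonmfg_change - mfg_change)


MFG_CHANGE, NONMFG_CHANGE = -1.48, 0.471   # 1972 task price changes from the text
AGGREGATE_CHANGE = -0.062                  # reported skill constant wage change

# Illustrative inference only: the weight that makes the two figures consistent.
share = implied_manufacturing_share(MFG_CHANGE, NONMFG_CHANGE, AGGREGATE_CHANGE)
check = quality_constant_wage_change([MFG_CHANGE, NONMFG_CHANGE],
                                     [share, 1.0 - share])
print(f"implied manufacturing weight: {share:.3f}")
print(f"weighted task price change: {check:.3f}")
```

Under these assumptions a manufacturing weight of roughly 0.27 reconciles the two sectoral task price changes with the reported aggregate.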

D. Assessing the Impact of Self-Selection on Inequality
in Log Wages

In this subsection we use our estimates of the extended Roy model to 
assess the impact of self-selection on inequality in market wage rates 
for employed white males. A commonly used measure of inequality— 
the variance in the natural logarithm of wages—and a prototypical 
year, 1980, are selected to make this assessment. We compare the 
observed variance in log wages (which is close to the variance pre¬ 
dicted by the model) with the variance in log wages that would result 
if people were randomly assigned to manufacturing, nonmanufactur¬ 
ing, or nonmarket activity in a sense to be made precise below. 

The first column of table 4 presents predicted values of sectoral and
economywide means and variances of log wage rates. Actual values
from the March 1980 CPS data are given in the second column.
Notice that there is close agreement between actual and predicted
values. The economywide variance is broken down into two components:
(a) variability within sectors and (b) variability between sectors.
The formula for the variance decomposition is given in the notes of
the table. Note that virtually all of the total variance in log wage rates
is due to within-sector variability (.99 = .288/.291).
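
The within/between split can be illustrated with the standard two-group variance identity, total = P1·var1 + P2·var2 + P1·P2·(mean1 - mean2)^2. The sketch below assumes this is the decomposition meant in the table notes; the sample values are hypothetical and chosen only so the identity can be verified against a pooled variance.

```python
from statistics import fmean, pvariance


def decompose_variance(group1, group2):
    """Split the pooled variance of two groups into within and between parts."""
    n1, n2 = len(group1), len(group2)
    p1, p2 = n1 / (n1 + n2), n2 / (n1 + n2)
    within = p1 * pvariance(group1) + p2 * pvariance(group2)
    between = p1 * p2 * (fmean(group1) - fmean(group2)) ** 2
    return within, between


# Hypothetical log wages for two sectors (illustrative values only).
sector1 = [1.0, 2.0, 3.0]
sector2 = [4.0, 6.0]
within, between = decompose_variance(sector1, sector2)
total = pvariance(sector1 + sector2)
print(within, between, total)

# Within-sector share implied by the 1980 figures quoted in the text.
reported_within_share = 0.288 / 0.291
print(round(reported_within_share, 2))
```

With the quoted figures, the between-sector component is only about 0.003 of a total variance of 0.291.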

The final column of table 4 presents values of sectoral and econo- 
mywide means and variances of log wage rates for the random assign¬ 
ment economy. This economy is constructed by randomly assigning 
people so that (a) the proportions employed in each sector are set to 
be the same as those predicted in the sectors in 1980 by our equilib¬ 
rium model and (b) sectoral task prices for the hypothetical economy 
are set at 1980 values, an assumption that is strictly defensible only if 
the aggregate task demand functions are perfectly elastic as in the 


‘"Our conclusions appear to be at odds with those of Stockman (1983) and Bils
(1985). Both conclude that there is little evidence of aggregation bias in aggregate
wages. Stockman excludes nonworkers from his sample and thus induces a sample
selection bias problem, which he notes but does not solve. Bils includes nonworkers in
his analysis and corrects for selection bias assuming that wages are lognormal. Recall,
however, that our tests reject the lognormal model. Neither author corrects for the
effect of sectoral self-selection decisions. Our procedure, which adjusts for selection
bias in a nonnormal model and accounts for the effect of comparative advantage on
measured wages, produces much stronger evidence of aggregation bias than do these
other studies.



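Table 4. Assessing the Impact of Self-Selection (table body not recovered). Variance decomposition from the table notes:

$$P_1\sigma_1^2 + P_2\sigma_2^2 + P_1 P_2(\mu_1 - \mu_2)^2$$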
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original Roy model. While this definition of a random assignment of 
workers is arbitrary, so is any other candidate definition. The virtue 
of the definition we select is that it takes as its point of departure the 
configuration of data actually observed in 1980. 11 Note that assump¬ 
tion b is not strictly required for computing within-sector variances 
since task prices do not affect within-sector variances of log wage 
rates. 

In the random assignment economy, the difference between the 
sectoral means of log wages is much greater than it is in the self¬ 
selection economy (0.317 vs. 0.145). Self-selection decreases the vari¬ 
ance of log wages within each sector over the case of random selection 
(by 8.3 percent in nonmanufacturing and 9.1 percent in manufactur¬ 
ing). Recall that a reduction in sectoral variances is predicted by the 
Roy model but is not imposed on the data by our more general model. 
It is interesting that this qualitative feature of the Roy model is consis¬ 
tent with the data. 

For the economy as a whole, self-selection reduces inequality. 
Within-sector inequality (summed over both sectors) declines by 7.4 
percent. Because of the dramatic compression in the means of sec¬ 
toral log wages, self-selection reduces the between-sector variance by 
83 percent. Overall, self-selection reduces inequality (the variance in
log wages) by 11.5 percent (from 0.329 to 0.291). 
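
The reported reductions can be cross-checked against one another. The sketch below uses only the rounded figures quoted in the text (total variances 0.329 and 0.291, within-sector variance 0.288 under self-selection, and an 83 percent fall in the between-sector variance). Because the inputs are rounded, the implied within-sector decline is only approximate.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage decline from `before` to `after`."""
    return 100.0 * (before - after) / before


# Overall reduction in the variance of log wages.
overall = percent_reduction(0.329, 0.291)

# Rough consistency check built from rounded figures in the text.
between_self_selection = 0.291 - 0.288                  # total minus within
between_random = between_self_selection / (1.0 - 0.83)  # undo the 83 percent fall
within_random = 0.329 - between_random
within_reduction = percent_reduction(within_random, 0.288)

print(f"overall reduction: {overall:.2f} percent")
print(f"implied within-sector reduction: {within_reduction:.1f} percent")
```

The overall figure is about 11.55 percent, in line with the 11.5 percent reported. The implied within-sector decline of roughly 7.5 percent is close to the reported 7.4 percent given the rounding.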

IV. Summary 

This paper derives and estimates an empirical equilibrium model of 
self-selection in the labor market that recognizes the existence of mea¬ 
sured and unmeasured heterogeneous skills within even narrowly 
defined demographic groups. We derive a model of the sectoral allo¬ 
cation of workers of different demographic types. We present a new 
econometric procedure that combines micro and macro data to esti¬ 
mate supply and demand functions for unmeasured productive attri¬ 
butes. Our estimated demand equations are downward-sloping func¬ 
tions of task prices. 

Our methodology extends previous statistical work on self-selection 
to an explicit market setting in which the prices of attributes respond 
to changes in the determinants of aggregate demand and supply. Our 


M If we had estimated the labor supply functions for all demographic groups it 
would be possible to compute equilibrium prices for tasks given a particular allocation
of workers across sectors. However, since we estimate the supply function for only one 
demographic group, this procedure is not available to us. 
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model extends previous empirical work on wage equations by in¬ 
troducing determinants of aggregate market demand and supply into 
an explicit, economically interpretable estimating equation. We ex¬ 
tend Roy’s model of self-selection by embedding it in a market setting 
and by (a) introducing a nonmarket sector, (b) allowing workers to
select their sector of employment on the basis of utility maximization
rather than income maximization, and (c) permitting unmeasured
attributes to be nonlognormally distributed. These extensions are re¬ 
quired to produce a model that fits data on wage distributions from 
the U.S. labor market. 

This study presents empirical evidence that justifies the commonly 
utilized practice of aggregating manufacturing into a single sector for 
the purpose of estimating labor demand functions. However, a new 
aggregate is required that recognizes both measured and unmea¬ 
sured heterogeneity in skills in the population and that accounts for 
self-selection decisions by agents. 

We use our model to estimate the importance of aggregation bias in 
measured aggregate real wage rates. Aggregation bias reduces mea¬ 
sured wage variability in manufacturing below what it would be if the 
quality of the manufacturing work force were held constant. How¬ 
ever, for the economy as a whole, precisely the opposite effect occurs. 
Aggregation bias causes measured aggregate wage variability to over¬ 
state quality constant wage variability. Because of comparative advan¬ 
tage, workers who move from one sector to another in response to a 
macro disturbance lower the average quality of the work force in the 
sector to which they go and raise the average quality in the sector 
from which they depart. This phenomenon accentuates measured 
wage variability over what it would be if sectoral labor force quality 
were held constant. 

We also use our model to assess the contribution of self-selection (or 
the pursuit of comparative advantage) to inequality in log wage rates. 
We find that self-selection reduces aggregate wage inequality by more 
than 10 percent. 


Appendix A 

The Box-Cox Transformed Truncated Normal Model 

The joint density of ($t_1$, $t_2$) is derived from equations (12) and (13) assuming
that ($u_1^*$, $u_2^*$) are joint normal random variables. Define ($u_1^*$, $u_2^*$) ~ $N(0, \Sigma_{u^*})$.
Let $\Phi(a_1, a_2; \rho)$ be the cumulative standardized bivariate normal with correlation
coefficient $\rho$, where $a_1$ and $a_2$ are upper limits of integration.

The joint density of ($t_1$, $t_2$) given x is



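[Equation (A1), the joint density $f(t_1, t_2 \mid x)$, is not recoverable from this copy. The legible fragments show an exponential kernel and a bivariate normal integral $\Phi$ whose correlation argument is $(\operatorname{sgn}\lambda_1)(\operatorname{sgn}\lambda_2)\rho_{12}$.]

where $\rho_{12}$ is the correlation coefficient between $u_1^*$ and $u_2^*$.
The marginal density of $t_1$ given x is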

For $\lambda_1 = 0$, this density specializes to the lognormal (we adopt the convention
throughout these appendixes that sgn $\lambda_1$ = 0 when $\lambda_1$ = 0). For further discussion
see Heckman and Sedlacek (198<i). 
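
The lognormal special case can be seen numerically. Equation (12) is not reproduced in this section, so the sketch below assumes the standard Box-Cox form y = (t^λ - 1)/λ, which tends to ln t as λ approaches zero. The function names are illustrative only.

```python
import math


def box_cox(t: float, lam: float) -> float:
    """Standard Box-Cox transform of a positive value t."""
    if t <= 0.0:
        raise ValueError("t must be positive")
    if lam == 0.0:
        return math.log(t)
    return (t ** lam - 1.0) / lam


def inverse_box_cox(y: float, lam: float) -> float:
    """Invert the transform; lam * y + 1 must stay positive (the truncation)."""
    if lam == 0.0:
        return math.exp(y)
    base = lam * y + 1.0
    if base <= 0.0:
        raise ValueError("y lies outside the admissible (truncated) range")
    return base ** (1.0 / lam)


# As lam shrinks toward zero the transform approaches the natural log.
for lam in (1.0, 0.1, 0.001):
    print(lam, box_cox(2.0, lam))
print("log:", math.log(2.0))
```

The positivity requirement in `inverse_box_cox` is what makes the underlying normal variate truncated whenever λ is not zero.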


Appendix B 


The Generalized Roy Model 


This appendix presents the generalized Box-Cox model. We also write out
the likelihood function for the model. Let $d_i = 1$, $i = 1, \ldots, 3$, if the
appropriate inequality in (17) is satisfied and zero otherwise.

Define 

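$$\rho_1 = \frac{\operatorname{cov}(v_1 - v_2, v_1)}{[\operatorname{var}(v_1)\operatorname{var}(v_1 - v_2)]^{1/2}}, \qquad \rho_2 = \frac{\operatorname{cov}(v_2 - v_1, v_2)}{[\operatorname{var}(v_2)\operatorname{var}(v_1 - v_2)]^{1/2}},$$

$$\rho_3 = \frac{\operatorname{cov}(v_1, v_2)}{[\operatorname{var}(v_1)\operatorname{var}(v_2)]^{1/2}}.$$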

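In the notation for the bivariate probit introduced in Appendix A,

$$\operatorname{pr}(d_1 = 1 \mid f) = \Phi\left(\frac{\gamma_1 f - \gamma_2 f}{[\operatorname{var}(v_1 - v_2)]^{1/2}}, \frac{\gamma_1 f}{[\operatorname{var}(v_1)]^{1/2}}; \rho_1\right), \tag{B1a}$$

$$\operatorname{pr}(d_2 = 1 \mid f) = \Phi\left(\frac{\gamma_2 f - \gamma_1 f}{[\operatorname{var}(v_1 - v_2)]^{1/2}}, \frac{\gamma_2 f}{[\operatorname{var}(v_2)]^{1/2}}; \rho_2\right), \tag{B1b}$$

$$\operatorname{pr}(d_3 = 1 \mid f) = \Phi\left(\frac{-\gamma_1 f}{[\operatorname{var}(v_1)]^{1/2}}, \frac{-\gamma_2 f}{[\operatorname{var}(v_2)]^{1/2}}; \rho_3\right). \tag{B1c}$$

Throughout we assume that $\operatorname{var}(v_1) = 1$.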


1. The Density of Accepted Wages in the Box-Cox Model 

We derive the density of accepted wages in sector 1. The density of sector 2 
accepted wages can be derived by a parallel argument. 

Agents enter sector 1 provided that inequalities (17) for i = 1 are satisfied.
These inequalities restrict the range of normal variates ($v_1 - v_2$, $v_1$). The
underlying $u_1^*$ is restricted by the inequality (13) presented in the text. Let


correl($u_1^*$, $v_1 - v_2$) = $\rho^*_{12}$, correl($u_1^*$, $v_1$) = $\rho^*_{11}$, A =

In this notation, the conditional density of $w_1$, using $w_1 = \pi_1 t_1$ and (12), is

$g(w_1 \mid \gamma_1 f - \gamma_2 f + v_1 - v_2 > 0,\ \gamma_1 f + v_1 > 0,\ \lambda_1 u_1^* > -\lambda_1 \beta_1 x - 1)$

where $\Phi(a, b, c; d, e, f)$ is a trivariate normal integral with upper limits a, b, c
and correlation structure d, e, f. Letting

$$\operatorname{correl}(u_2^*, v_2 - v_1) = \rho^*_{21}, \qquad \operatorname{correl}(u_2^*, v_2) = \rho^*_{22}, \qquad A = \frac{\gamma_2 f - \gamma_1 f}{[\operatorname{var}(v_2 - v_1)]^{1/2}},$$

B =

the density of accepted wages in sector 2 is

2. Accepted Wage Distributions When Wages Are above a Threshold

We modify the densities in Section 1 to account for a sampling rule that $w_i > \tau$,
where $\tau$ is a minimum threshold value. We derive the sector 1 accepted
wage distribution. The derivation of the sector 2 accepted wage distribution
follows by a parallel argument.

The requirement $w_1 > \tau > 0$ translates into the restriction

$$w_1 = \pi_1(\lambda_1\beta_1 x + \lambda_1 u_1^* + 1)^{1/\lambda_1} > \tau$$

or

$$u_1^* > \frac{(\tau/\pi_1)^{\lambda_1} - 1}{\lambda_1} - \beta_1 x \tag{B4}$$

for all values of $\lambda_1$ not equal to zero. Combining restriction (13) with (B4) and
assuming that $\lambda_1 < 0$ implies that

Notice that as $\lambda_1 \to 0$ from below this inequality becomes

$$u_1^* > -\left(\beta_1 x + \ln(\pi_1/\tau)\right).$$

When $\lambda_1 > 0$, (B4) will be the appropriate inequality; that is, (13) imposes no
extra restrictions on the range of $u_1^*$ beyond the one already imposed by (B4).
Letting

the density of accepted wages in sector 1 given $w_1 > \tau$ and $\lambda_1 > 0$ is

$g(w_1 \mid \gamma_1 f - \gamma_2 f + v_1 - v_2 > 0,\ \gamma_1 f + v_1 > 0,\ w_1 > \tau,\ \lambda_1 > 0)$

The density of accepted wages in sector 1 given $w_1 > \tau$ and $\lambda_1 < 0$ is (B5)
multiplied by

$$\frac{\Phi[-(C + D), A, B; \rho^*_{12}, \rho^*_{11}, \rho_1]}{\Phi(C, A, B; -\rho^*_{12}, -\rho^*_{11}, \rho_1) - \Phi(C + D, A, B; -\rho^*_{12}, -\rho^*_{11}, \rho_1)}.$$

Recall that D < 0 if $\lambda_1 < 0$. The density $g(w_2 \mid \gamma_2 f - \gamma_1 f + v_2 - v_1 > 0,\ \gamma_2 f + v_2 > 0,\ w_2 > \tau,\ \lambda_2 < 0)$
is derived by a parallel argument.
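
The way a wage threshold maps into a bound on the underlying normal variate can be checked numerically. The sketch below assumes the Box-Cox wage form w = π(λ(βx + u) + 1)^(1/λ) and inverts it to get a lower bound on u. All parameter values are hypothetical, and the function names are not from the paper. The point of the check is that the same lower bound applies whether λ is positive or negative.

```python
def wage(pi: float, lam: float, beta_x: float, u: float) -> float:
    """Wage implied by an assumed Box-Cox task equation with task price pi."""
    base = lam * (beta_x + u) + 1.0
    if base <= 0.0:
        raise ValueError("outside the admissible range of the transformation")
    return pi * base ** (1.0 / lam)


def lower_bound_on_u(pi: float, lam: float, beta_x: float, tau: float) -> float:
    """Value of u above which the wage exceeds the threshold tau (lam != 0)."""
    return ((tau / pi) ** lam - 1.0) / lam - beta_x


PI, BETA_X, TAU = 1.2, 0.3, 0.75          # hypothetical values
GRID = [i / 10 for i in range(-15, 16)]   # u values kept inside the admissible range


def check_equivalence(lam: float) -> bool:
    """True if 'wage > tau' and 'u > bound' agree at every grid point."""
    bound = lower_bound_on_u(PI, lam, BETA_X, TAU)
    return all((wage(PI, lam, BETA_X, u) > TAU) == (u > bound) for u in GRID)


for lam in (0.5, -0.5):
    print(lam, round(lower_bound_on_u(PI, lam, BETA_X, TAU), 4), check_equivalence(lam))
```

For negative λ the admissibility condition additionally caps u from above, which is why the two cases are treated separately in the text.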


3. The Likelihood Function for Our Model for a Sample Consisting of All
Nonworkers plus Workers with Wage Rates above a Threshold

In this section we utilize results derived in Sections 1 and 2 to write out the
likelihood function used to estimate our model. An individual is in our sample
if he chooses to go to sector 1 and his wages are above $\tau$, if he chooses to go
to sector 2 and his wages are above $\tau$, or if he chooses to go to sector 3. Denote
by $\mathcal{P}$ the probability that an individual with characteristics f satisfies one of
these three criteria. Then

$$\mathcal{P} = \operatorname{pr}(d_1 = 1, w_1 > \tau \mid f) + \operatorname{pr}(d_2 = 1, w_2 > \tau \mid f) + \operatorname{pr}(d_3 = 1 \mid f). \tag{B6}$$

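The probability that an agent chooses sector 1 and has a wage above
threshold $\tau$ given that $\lambda_1 > 0$ is

$$\operatorname{pr}(d_1 = 1, w_1 > \tau \mid f, \lambda_1 > 0) = \Phi(A, B; \rho_1)\,\frac{\Phi[-(C + D), A, B; \rho^*_{12}, \rho^*_{11}, \rho_1]}{\Phi[-C, A, B; \rho^*_{12}, \rho^*_{11}, \rho_1]}. \tag{B7}$$

For $\lambda_1 < 0$, the desired probability is

$$\operatorname{pr}(d_1 = 1, w_1 > \tau \mid f, \lambda_1 < 0) = \Phi(A, B; \rho_1)\,\frac{\Phi(C, A, B; -\rho^*_{12}, -\rho^*_{11}, \rho_1) - \Phi(C + D, A, B; -\rho^*_{12}, -\rho^*_{11}, \rho_1)}{\Phi(C, A, B; -\rho^*_{12}, -\rho^*_{11}, \rho_1)}. \tag{B7$'$}$$

For sector 2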

The probability that an individual will be observed in sector 1 given that he
is in the sample, defined as P1, is

$$P1 = \frac{\operatorname{pr}(d_1 = 1, w_1 > \tau \mid f)}{\operatorname{pr}(d_1 = 1, w_1 > \tau \mid f) + \operatorname{pr}(d_2 = 1, w_2 > \tau \mid f) + \operatorname{pr}(d_3 = 1 \mid f)}.$$

The contribution to likelihood function L of an individual with characteristics
f observed in sector 1 is

$$L_1 = g(w_1 \mid f, d_1 = 1, w_1 > \tau) \cdot P1, \tag{B10}$$

where $g(w_1 \mid f, d_1 = 1, w_1 > \tau)$ is defined in (B5).

By a parallel argument the probability that an individual will be observed in
sector 2 given that he is in the sample, defined as P2, is

$$P2 = \frac{\operatorname{pr}(d_2 = 1, w_2 > \tau \mid f)}{\operatorname{pr}(d_1 = 1, w_1 > \tau \mid f) + \operatorname{pr}(d_2 = 1, w_2 > \tau \mid f) + \operatorname{pr}(d_3 = 1 \mid f)}. \tag{B11}$$

The contribution to likelihood function L of an individual with characteristics
f observed in sector 2 is

$$L_2 = g(w_2 \mid f, d_2 = 1, w_2 > \tau) \cdot P2, \tag{B12}$$

where $g(w_2 \mid f, d_2 = 1, w_2 > \tau)$ is defined analogously to (B5) and P2 is defined
in (B11).

The probability that an agent with characteristics f chooses sector 3 (the
nonmarket sector) conditional on being in the sample is

$$P3 = \frac{\operatorname{pr}(d_3 = 1 \mid f)}{\operatorname{pr}(d_1 = 1, w_1 > \tau \mid f) + \operatorname{pr}(d_2 = 1, w_2 > \tau \mid f) + \operatorname{pr}(d_3 = 1 \mid f)}. \tag{B13}$$

The contribution to likelihood function L of an individual with characteristics
f observed in sector 3 is

$$L_3 = P3, \tag{B14}$$

where P3 is defined in (B13).

The likelihood satisfies classical regularity conditions because it is twice 
continuously differentiable in the parameters of the model, and the range of 
each random variable does not depend on the parameters of the model. 
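
The sample-conditional probabilities and likelihood contributions have a simple structure once the three unconditional probabilities are in hand. The sketch below takes those probabilities, and the accepted wage density, as given inputs; it does not evaluate the bivariate or trivariate normal integrals themselves. The numerical inputs are hypothetical and the function names are illustrative.

```python
import math


def conditional_sector_probabilities(pr1_above: float, pr2_above: float, pr3: float):
    """Normalize the three inclusion probabilities so that they sum to one."""
    total = pr1_above + pr2_above + pr3
    if total <= 0.0:
        raise ValueError("inclusion probabilities must sum to a positive number")
    return pr1_above / total, pr2_above / total, pr3 / total


def log_contribution(sector: int, probs, wage_density=None) -> float:
    """Log likelihood contribution for an individual observed in `sector` (1, 2, or 3)."""
    p = probs[sector - 1]
    if sector == 3:
        # Nonworkers contribute only their sample-conditional probability.
        return math.log(p)
    if wage_density is None or wage_density <= 0.0:
        raise ValueError("workers need a positive accepted wage density")
    # Workers contribute density times sample-conditional probability.
    return math.log(wage_density) + math.log(p)


# Hypothetical inclusion probabilities for one individual.
probs = conditional_sector_probabilities(0.55, 0.20, 0.15)
print(probs)
print(log_contribution(1, probs, wage_density=0.4))
print(log_contribution(3, probs))
```

Dividing by the sum of the three inclusion probabilities is what adjusts the likelihood for the exclusion of workers with wages below the threshold.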


Appendix C

Description of the Data 

1. CPS Data 

The sample utilized to estimate the extended Roy model is derived from the 
March Current Population Survey (CPS). From the CPS data base of 1968-81 
we randomly select a 4 percent subsample of civilian white males between the 
ages of 18 and 65. In constructing our sample we eliminate any observation 
with imputed data for any of the variables utilized in the analysis. 

The following variables are extracted from the CPS data file: annual labor
income last year, hours worked last week, number of weeks in the labor force
last year, total income last year, years of schooling, age of the person, three-digit
industry code of last year’s job, and current state of residence. Total
income and labor income variables are transformed into real variables by
dividing by the CPI; we use constant 1967 dollar values.

We construct two variables: hourly wage rate and income from nonlabor 
sources. The hourly wage rate is obtained by dividing the labor income the 
respondent obtained in the year prior to the interview by the product of the 
number of weeks he was in the labor force in that year and the number of 
hours he worked in the week prior to the interview. The income from nonlabor
sources is obtained by subtracting the labor income from the total income,
both defined for the year prior to the interview. The sectoral nonlabor
income obtained is then regressed on the following exogenous variables: age,
education, state of residence, and polynomials of these variables. The predicted
value from this regression is then utilized as a regressor to avoid
spurious correlation between assets and the unobservables in the choice equations.

As noted in the text, we exclude all individuals whose real hourly wages are 
below $0.75. The lower tail of the hourly wage distribution is excluded to 
minimize the effects of measurement error. 
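
The construction of the two derived variables and the sample exclusion rule can be sketched as follows. The function names and the example record are hypothetical, and the CPI is assumed to be expressed as a ratio to its 1967 level (1967 = 1.0).

```python
WAGE_FLOOR = 0.75  # real hourly wage threshold in 1967 dollars, from the text


def real_hourly_wage(labor_income, weeks_in_labor_force, hours_last_week, cpi):
    """Real labor income divided by (weeks x hours); None if undefined."""
    if weeks_in_labor_force <= 0 or hours_last_week <= 0:
        return None
    return (labor_income / cpi) / (weeks_in_labor_force * hours_last_week)


def nonlabor_income(total_income, labor_income):
    """Income from nonlabor sources for the year prior to the interview."""
    return total_income - labor_income


def keep_worker(wage):
    """Workers with real hourly wages below the floor are excluded."""
    return wage is not None and wage >= WAGE_FLOOR


# Hypothetical respondent: nominal income figures with a CPI of 2.0 (1967 = 1.0).
w = real_hourly_wage(labor_income=12000.0, weeks_in_labor_force=50,
                     hours_last_week=40, cpi=2.0)
print(w, keep_worker(w), nonlabor_income(13500.0, 12000.0))
```

Note that the denominator mixes weeks from the previous year with hours from the survey week, exactly as described above, so it is a noisy measure of annual hours.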


2. Industry Data 

The following data series are utilized to estimate the industry task demand
functions: industrial commodity price index, farm products price index, intermediate
goods price index, energy price index, nonresidential fixed investment
price deflator, corporate bond (Moody's) Aaa yields, total population by
demographic group, average hours worked in each sector, and number of
workers and hourly wages in the manufacturing and nonmanufacturing sectors.
These series are obtained from the Statistical Abstracts of the United States
(1968-81) and Historical Statistics of the United States, Colonial Times to 1970,
both published by the U.S. Department of Commerce, Bureau of the Census.

The information required to estimate the industry demand equation described
in the text is data on output price indices for both the manufacturing
and nonmanufacturing sectors. The industrial commodities price index and
the farm output price index are used as output price indices for the manufacturing
and nonmanufacturing sectors, respectively.

The total wage bills in the manufacturing and nonmanufacturing sectors
are obtained as the product of total number of employees times the average
hourly wage they receive times the average number of hours worked in each
sector.


3. Data Sets for the Simulations Reported in Sections IIIC and IIID

The simulation results reported in the text require the empirical distribution
of exogenous characteristics in the population. These distributions are obtained
from a 20 percent random sample derived from the CPS data file for
the period 1968-81. The variables selected are age, years of schooling, and
state of residence. Individuals with missing data for any of these three variables
are excluded from the sample. The estimate of income from nonlabor
sources is obtained by regressing nonlabor income in the population on the
exogenous variables described above and polynomials of those variables.


4. Definitions of the Variables Utilized in the Analysis

Hourly wage rate = total labor income/(weeks x hours).

Weeks = weeks in the labor force in the previous year.
Hours = hours worked in the week prior to the interview.

Nonlabor income = total income - total labor income.
Total income = total income of the respondent in the previous year.
Total labor income = wage and salary income + nonfarm self-employment
income + farm self-employment income (all in the previous year).

Education = years of schooling.

Experience = age - education - 6.

South = 1 if the respondent was living in the U.S. Census South at the time of
the interview, 0 otherwise.

Sector choice = 1, working in a nonmanufacturing industry;

= 2, working in the manufacturing sector, last year three-digit
industry code falls between 107 and 398;

= 3, not working, has zero total labor income in the previous
year.

Energy price index = producer price index for energy.

Intermediate goods price = intermediate goods price index.

User cost of capital = nonresidential fixed investment price deflator times the
corporate bond (Moody’s) Aaa yields.
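
Several of these definitions are simple enough to express directly. The sketch below is illustrative only: the function names are hypothetical, and the industry code range is treated as inclusive at both ends, which the text does not state explicitly.

```python
def experience(age: int, education: int) -> int:
    """Potential experience: age minus years of schooling minus 6."""
    return age - education - 6


def sector_choice(industry_code: int, total_labor_income: float) -> int:
    """1 = nonmanufacturing, 2 = manufacturing, 3 = not working."""
    if total_labor_income == 0:
        return 3
    # Assumption: 'between 107 and 398' includes both endpoints.
    if 107 <= industry_code <= 398:
        return 2
    return 1


def user_cost_of_capital(investment_deflator: float, aaa_yield: float) -> float:
    """Nonresidential fixed investment price deflator times the Aaa bond yield."""
    return investment_deflator * aaa_yield


print(experience(age=30, education=12))
print(sector_choice(industry_code=200, total_labor_income=9000.0))
print(sector_choice(industry_code=600, total_labor_income=9000.0))
print(sector_choice(industry_code=200, total_labor_income=0.0))
```

Checking for zero labor income first means a respondent with a recorded industry code but no earnings is still classified as a nonworker, which matches the definition of sector 3 above.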
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Appendix E 

Notes on the Computation of the Aggregation Bias 
Simulations Reported in Table 3 


The simulations reported in Section IIIC in the text are performed using the
parameters estimated for the extended Roy model (reported in table 1) with 
task prices and the intercepts of the utility functions adjusted to take into 
account the energy price increase. 

The impact of an energy price increase on the supply of tasks is assumed to 
operate only through its effect on the task prices. Estimates of the intercepts
of the reduced-form utility functions include the effect of task prices on the 
sector-specific utility of agents. 

To compute the response of task prices to changes in the energy price we
regress estimated log task prices on log energy prices and other determinants
of the equilibrium market price. The estimated price equations (with standard
errors in parentheses below the coefficients) are:

$$\ln \pi_{1t} = \underset{(.350)}{-.0736} + \underset{(1.145)}{.471}\cdot(\text{log energy price index}) - \underset{(7.232)}{2.021}\cdot(\text{log intermediate goods price}) + \underset{(.811)}{.8349}\cdot(\text{log user cost of capital}),$$

$$R^2 = .2810,\quad \text{D-W} = 3.046;$$

$$\ln \pi_{2t} = \underset{(.350)}{.18611} - \underset{(1.09)}{1.4800}\cdot(\text{log energy price index}) - \underset{(6.912)}{2.934}\cdot(\text{log intermediate goods price}) + \underset{(.795)}{1.6894}\cdot(\text{log user cost of capital}),$$

$$R^2 = .3939,\quad \text{D-W} = 2.91.$$


(The variables are defined in App. C.) Essentially the same empirical results 
are obtained if time trends are included in the regression. 

In order to estimate the effects of task price changes on sectoral choices it is
necessary to decompose the estimated year effects in the utility functions (the
$\gamma_{0it}$) into two components: the contribution of log task price and the contribution
of unobserved supply characteristics. We approximate the latter by time-trended
variables: a time trend and the unemployment rate in the United
States. The regressions of the estimated intercepts on the estimated log task
prices, a time trend, and the unemployment rate (standard errors in parentheses)
are:


$$\gamma_{01t} = \underset{(2.14)}{.543} + \underset{(.153)}{.154}\cdot(\ln \pi_{1t}) - \underset{(.0256)}{.0051}\cdot(\text{time trend}) + \underset{(.0649)}{.0515}\cdot(\text{unemployment rate}),$$

$$R^2 = .3280,\quad \text{D-W} = 2.46;$$

$$\gamma_{02t} = \underset{(.732)}{-.712} + \underset{(.0703)}{.0502}\cdot(\ln \pi_{2t}) + \underset{(.009)}{.009}\cdot(\text{time trend}) + \underset{(.0384)}{.0204}\cdot(\text{unemployment rate}),$$

$$R^2 = .2852,\quad \text{D-W} = 1.66.$$

The estimated coefficients imply that a 1 percent increase in the price of
energy decreases the manufacturing sector task price by 1.48 percent and
increases the nonmanufacturing sector task price by 0.47 percent, and that
the intercepts in the utility function will shift by 0.072 in the nonmanufacturing
sector (= 0.471 x 0.154) and by -0.074 in the manufacturing sector
(= -1.48 x 0.0502).
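
The chain from energy price to utility intercept is a product of two regression coefficients per sector. The sketch below reproduces that multiplication using the coefficients quoted above; the dictionary and function names are illustrative only.

```python
# Coefficients on the log energy price index in the log task price equations.
ENERGY_COEF = {"nonmanufacturing": 0.471, "manufacturing": -1.48}
# Coefficients on the log task price in the utility intercept regressions.
INTERCEPT_COEF = {"nonmanufacturing": 0.154, "manufacturing": 0.0502}


def task_price_response(sector: str, energy_pct: float = 1.0) -> float:
    """Percent change in the sector task price for a given percent energy price change."""
    return ENERGY_COEF[sector] * energy_pct


def intercept_shift(sector: str) -> float:
    """Shift in the utility intercept: energy coefficient times task price coefficient."""
    return ENERGY_COEF[sector] * INTERCEPT_COEF[sector]


for sector in ("nonmanufacturing", "manufacturing"):
    print(sector, task_price_response(sector), round(intercept_shift(sector), 4))
```

The products are approximately 0.0725 and -0.0743, which correspond to the 0.072 and -0.074 reported in the text.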
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Changing World Prices, Women's Wages, and 
the Fertility Transition: Sweden, 1860-1910 


T. Paul Schultz 

Yale University


This paper identifies demand-induced changes in the price of women's
time as a factor determining the fertility transition. Changes in
world prices of grains and animal products in the 1880s affected the
composition of Swedish output and labor demands. The increase in
the price of butter relative to grains improved women's wages relative
to men's and contributed thereby to the decline in fertility.
When child mortality, urbanization, and the real wages of men are
held constant, aggregate county-level data for a 50-year period,
1860-1910, suggest that this exogenous appreciation in the value of
women's time relative to men’s explains a quarter of the concurrent
decline in Swedish fertility.


Economic explanations for the historical fertility decline in the West
generally assume that the price of children has increased relative to
other goods and that, in the view of parents, some of these other
goods substitute for having many children. Moreover, this substitution
by parents away from a large family because of the price change
must be sufficiently strong to offset the tendency conjectured by
Adam Smith (1776) and Thomas Malthus (1798) for fertility, or at
least a woman’s surviving offspring, to increase with real wages and
rising standards of living. To argue that a change in the relative price
of children has “explained” the secular trend in fertility, the price
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change must come from outside the family or household sector; 
namely, it must be exogenous to parents and be independent of their 
preferences and behavior. 1 To give this economic interpretation more 
content, a market-dictated change in price must be specified that is 
reasonably linked to the rising price of children and to the consequent 
fertility transition. 

It is widely conjectured that an important element in the rising 
opportunity cost of children is the market value of women’s time 
relative both to market goods and services and to the market value of 
men's time (Becker 1960; Mincer 1963). Most empirical evidence as¬ 
sembled in support of the hypothesis that the rising value of women’s 
wage opportunities contributes to the secular decline in fertility is 
based on cross-sectional correlations, where women’s wage opportu¬ 
nities are either proxied at the individual level by women’s education 
or measured at the aggregate level by the regional wage rate, holding 
constant in both cases men’s wages, income, or education and a vari¬ 
ety of other possible fertility determinants, such as the level of child 
mortality (Knodel 1979; Schultz 1981). In the first case, there are
many noneconomic interpretations for the cross-sectional association 
between women’s education and their fertility, and a deeper analysis 
is called for to indicate whether this association is predominantly a 
reflection of economic attributes of more educated consumers and 
producers. In the second case, the aggregate wage received by work¬ 
ing women is influenced by the number and type of women seeking 
work and their conditions of work. Women’s wages may be depressed 
(elevated) by their increased (decreased) aggregate labor supply and 
net migration, as well as by their selective characteristics. 2 If women 


1 Consequently, it is not possible to draw cause and effect inferences from the observation
that parents who have fewer children also do other things, such as invest more
resources in the education of each child; nor is it possible to say that their values 
regarding children are different. Both fertility and these other responses may be inde¬ 
pendently or jointly affected by an exogenous third factor, and thus the correlation 
between these family responses may not be observed in other circumstances. 

2 For example, women's wages may be relatively high in an area because the area's
natural resources and the prices for its output favor the employment of a relatively
large number of female workers in, e.g., dairying, food processing, or textile manufacturing.
These resources and relative prices would represent the derived demand side of
the labor market, and their change over time could be attributable to technological
change, institutional evolution, discovery of new resources, and possibly long-run
climatic change. Alternatively, women’s wages could be relatively high as a consequence
of other factors that induced relatively few women to participate in the market labor
force or that encouraged a predominantly female emigration from the region. These
labor supply effects could provide a second explanation for why working women would
receive relatively high wages. Alternatively, where women participate in the labor force
more frequently and steadily over their life cycle, they accumulate labor market experiences
that are more similar to those of men, and consequently women's wages rise
relative to men's. In either case, the endogenous shifts in labor supply should be
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enter jobs where investment in market skills and on-the-job training 
becomes important, their observed wages over their employment 
(life) cycle may deviate systematically from a static concept of mar¬ 
ginal productivity, or the average opportunity value of all women’s 
time may differ from the wage accepted by working women (Smith 
and Ward 1985). 

To distinguish between the supply and demand determinants of 
wage rates, information is needed on a force outside the local econ¬ 
omy and society that autonomously causes relative prices or techno¬ 
logical possibilities to change in a quantifiable manner, thereby affect¬ 
ing only the aggregate derived demand for female labor relative to 
that for male labor. Any ensuing variation in relative wages for 
women and men thus identified may then be treated as demand in¬ 
duced and independent of labor supply behavior. The crux of the
economic demand hypothesis is that the output-price-driven changes
in the structure of production explain women’s historically changing
economic and domestic roles, with fertility adapting to these economically
modified opportunities. 3 As an alternative explanation, the fertility
transition may be seen as a series of changes in individual supply
behavior induced by changes in culture (Coale 1978; Mosk 1983) or
by the secularization of society (Lockridge 1983), changes that simultaneously
alter reproductive goals and encourage women to engage in
lifetime labor market commitments, with the incidental on-the-job
training effect of raising the market wages of women relative to men.
This paper argues that exogenous international price changes pro¬ 
moted change in fertility and women's labor market roles in Sweden, 
and these accurately monitored inducements for the composition of 
output and labor demands to change can be confidently treated as 
exogenous to the Swedish society. If the empirical case for Sweden is 


somehow netted out of wage changes if one is to obtain a satisfactory test of the
economic explanation of the fertility transition, which posits that exogenous shifts in
demand cause the rising value of women's time relative to men's.

3 Snell (1981) describes gradual changes in women's roles in England in the eighteenth
century as induced by shifts in the relative prices of dairy products vs. food
grains (similar to the shifts to be described below), with repercussions on marriage
patterns and possibly fertility: "The historical determinants of women's economic and
domestic roles would appear to be located primarily in the seemingly autonomous
changes in the structure of the economy, rather than in shifts of social attitudes"
(p. 436). Men and women were of course both employed in the production of food
grains and dairy products, but they are assumed to be imperfect substitutes for each
other and women are hypothesized to exhibit a comparative advantage in dairying.
Child labor may also be a close substitute for female labor in some productive activities.
Coincident movements in wages of women and children might then weaken the inverse
relationship anticipated here between women’s wages and fertility since higher child
wages would presumably encourage larger families. Unfortunately, I have not found
any wage series for children in Sweden to explore these issues, which were suggested by
Peter Lindert.
persuasive, that this price development contributed to the onset of the
fertility transition, it may help to explain the common timing of the 
fertility transition in many other northern European countries that 
confronted similar problems of domestic economic adjustment to the 
same shift in world agricultural prices in the latter half of the 
nineteenth century (e.g., Ankarloo 1979; Matthiessen 1983). 


The Economic and Demographic Setting 

From about 1750 to 1850 the price of grains in Europe increased 
relative to agricultural wages (Slicher van Bath 1963). This develop¬ 
ment encouraged the extension of cultivated areas, investment in 
agricultural infrastructure, consolidation of landholdings, and tech¬ 
nological change in Sweden (Thomas 1941) as it did elsewhere in 
Europe. For example, the real wage paid to men for day labor during 
the summer harvest season stagnated from 1732 to about 1850 in 
Sweden (App. table A1). From 1851-55 to 1876-80 freer trade in
agricultural goods contributed to a fourfold increase in Swedish grain 
exports, with oats accounting for 90 percent of these exports by the 
end of the period (Fridlizius 1957, table 13). But with the European 
grain crisis of the 1880s, Swedish grain exports collapsed and were 
only one-tenth their former level by 1900, while butter exports in¬ 
creased fourfold, replacing grains as a proportion of total exports 
(Fridlizius 1957, table 64). This redeployment of agricultural re¬ 
sources was induced largely by the decline in the world price of grains 
relative to the price of butter and other animal products. Protection 
measures adopted by Sweden did not insulate the domestic economy 
from this price shock. In the 1860s and 1870s more than half of 
Sweden’s national income from agriculture was generated by the pro¬ 
duction of crops rather than livestock or dairying, whereas the share 
of crops had fallen to 30 percent by 1906-10 (Thomas 1941, table
22). The primary cause for the realignment in world relative prices
was the opening up of fertile new lands in the midwestern United
States and the Russian steppes, combined with the declining cost of 
transportation from these regions to European markets. 

Technological change reinforced these pressures of world relative 
prices to encourage the reallocation of production from grain to 
animal husbandry. Improvements in livestock breeding facilitated a 
doubling of milk yields per cow and increased butterfat percentages 
in the last quarter of the century (Lindahl, Dahlgren, and Kock 1937). 
The efficiency with which milk was processed into butter and cheese 
also increased, through the improvement in separator technology 
pioneered in Lund. Refrigerated transportation also widened the 
market for Sweden’s dairy and animal products. 
Dairying and milk processing were “women’s work” in Sweden, as
they often are in other societies (Snell 1981). The production of 
grains and root crops, on the other hand, placed greater demands on 
male labor, and integrated beef and pork production probably also 
employed mainly male labor. The hypothesis tested later in this paper 
is that price movements that raised the profitability of butter production
relative to grain production should have been associated with
improvements in women’s agricultural wages relative to men’s, other 
things being equal. 

At the other extreme of the employment continuum, forestry and 
sawmills hired almost exclusively male workers. 4 Although data are
scarce on employment in this sector, Swedish output grew rapidly in
the latter half of the century. By 1861-65, exports of boards and
planks had equaled in value Sweden’s traditional exports of iron and
thereafter increased fourfold before subsiding at the turn of the century.
With high wages but few jobs for women, the northern tier of
counties that relied heavily on forestry may have resembled isolated
mining communities, which have been shown to exhibit unusually
high levels of fertility in Europe and America (Haines 1979).

In the industrial labor force women were concentrated in textiles
and food processing. The 1920 census, the first to distinguish the
industrial distribution of the labor force by sex, reported that three-fourths
of the women employed in industry worked in these two
sectors, which accounted for 28 percent of the entire industrial labor
force. 5 Wage series in these two sectors show that women’s wages were
between 60 and 67 percent of those received by males. By contrast,
day agricultural wages for women were as low as 50 percent of male
wages in 1870-74 (see table 1), when oats exports were still expanding,
but they had increased to 61 percent of male wage levels by the
start of World War I. Annual cash contracts for labor, referred to as
agricultural servants, experienced an even sharper rise of one-half in
the wage level for women relative to men: female servant cash wages
were 12 percent of male servant wages in 1870-74 but had reached
68 percent of male levels by 1910-14 (table 1). Given the greater need
for year-round labor to tend livestock and perform dairy functions, it
is also not surprising to note that the premium paid for casual day


4 According to the 1920 census, women represented only 1 percent of the industrial
workers in lumbering (Thomas 1941, table 43).

5 In the 1920 census, women accounted for 31 percent of the industrial labor force in
food processing, while their share in textiles was 76 percent (Thomas 1941, table 43).
Wages for women relative to men did not change notably in textiles from the start of
the series in 1865 to 1914. In food products women's hourly wage increased from 60
percent of male levels in 1800 to 65 percent in 1914 (Bagge, Lundberg, and Svennilson
1933, vol. 1, table 18).
labor during the summer grain harvest diminished relative to the
wages paid to day labor during the winter season. 6 All of these developments
in the structure of wages in agriculture mirror the increased
opportunity value of women's time, particularly in year-round em¬ 
ployment associated with dairying. 

Technical rationalization and mechanization of Swedish agriculture 
after 1850 reduced the demand for tenants, servants, day laborers, 
and other categories of hired help. But the movement of landless 
workers from agriculture to the cities and, after 1860, the upsurge in
emigration, mainly to the United States, also reduced the supply of
agricultural labor. Wages of male day labor deflated by the price of
the principal food grain, rye, doubled from 1870 to 1914, while cash
wages of male servants increased even more rapidly (Bagge et al.
1933; Jorberg 1972). The Swedish population had managed to adapt
to the dislocating effects of the changing relative values the world 
economy attached to its exports of grains and animal products, princi¬ 
pally butter. This adjustment in the mix of agricultural production 
occurred at a time of large-scale emigration and of increasing urbanization
and industrialization (Karlstrom 1980; Mosher 1980). How did
these different but intertwined developments affect the level of fertil¬ 
ity in Sweden? 

The data analyzed to consider this question relate to the 25 counties 
of Sweden and the city of Stockholm for six decadal cross sections 
from 1860-64 to 1910-14 (see fig. 1). National demographic, price,
and wage series are summarized in table 1. In this half century the
Swedish total fertility rate decreased by 28 percent after increasing
somewhat in the 1860s and 1870s. Cohort data constructed for
women born between 1840 and 1890 indicate that cumulative fertility
decreased by 43 percent across these birth cohorts (Hofsten and
Lundstrom 1976, table 2.2). Mortality among children up to age 10 
continued its secular decline, falling in this 50-year period by 52 percent,
though not without increasing as in 1880-84. The marital fertility
rate followed much the same path as the total fertility rate, declining
by 26 percent from 1860 to 1910. Thus dramatic changes in
marriage patterns were apparently not responsible for the noted na¬ 
tional trends in total fertility rates, although beneath the national 
figures regional changes in marriage patterns may have responded to 
different economic and demographic conditions. 

The objective of the empirical analysis is to test three hypotheses: 
(1) Do changing local prices of basic traded agricultural commodities 
help to explain the cross-sectional variation and temporal changes in 


6 The summer day wage for males was 53 percent more than the winter day wage in
1870 but only 34 percent greater by 1915 (Bagge et al. 1933, vol. 1). 
male and female wages in Sweden? (2) Do women's and men’s wages
exhibit the anticipated association with total and age-specific fertility
rates, controlling for child mortality, urbanization, and the rising gen¬ 
eral standard of living? (3) Given the likelihood that factors other 
than shifts in the demand for labor influenced the observed wages of 
men and women in agriculture (i.e., supply responses via migration, 
differential female labor force participation, cumulative labor market 
experience), do time-series variations in commodity prices, plus the 
smaller interregional variations in prices, provide a satisfactory instrument
to estimate without simultaneous equation bias the demand
determinants of fertility operating through male and female wages?
This final step in the analysis is a severe, but nonetheless appropriate,
test of the gist of the demand hypothesis, which assigns importance to
exogenous variation in market prices and thereby to changes in sex-specific
wages in governing the historical fertility decline in the
[...]. More specifically, the study seeks to answer the question
[...] the identified demand-induced increase in women's wages
[...] decrease in Swedish total fertility rates.


Empirical Results 

The regressions in table 2 report the linear ordinary least squares
(OLS) relationships between three groups of conditioning variables
and three dependent variables: (1) the total fertility rate, (2) the wage
rate for male agricultural day labor, and (3) the female agricultural
wage rate relative to the male. The first set of regressions includes as
explanatory variables only the relative price of the main categories of
agricultural outputs: the price of butter relative to rye, the price of
pork relative to rye, and a dummy variable equal to one when the
pork price is not locally reported as a basis for settling tax obligations
in the county. 7 A higher relative price of butter is associated with a
higher male wage level, a higher female-to-male wage ratio, and a
lower fertility rate, as hypothesized. Pork prices relative to rye are 
associated with higher male wages, lower female to male wages, and 


7 Seven of the 25 counties did not report pork prices throughout the period 1860-1910.
These counties were regionally grouped in the north (AC, W, Y) and the south
(K, L, M, N) (see fig. 1). The dummy variable for these regions denotes the irrelevance
of pork prices to production in regions where pork is economically unimportant. One
county also does not report butter or rye prices in this period (K, or Blekinge) for no
obvious reason; perhaps the historical records of the government tax settlement prices
were destroyed. Regardless, the price of butter relative to rye for this coastal county is
"imputed" as an unweighted average of those reported by the neighboring counties of
Kristianstad and Kalmar. These relative price series are 5-year averages from Jorberg
(1972, 2:557-72). The addition of other grain prices, such as for oats, barley, and
wheat, added little explanatory power since all grain prices are highly correlated. Adding
other price series for, e.g., beef, cows, and pinewood did not appear to change the
relationships noted here but required the estimation of additional, more inclusive
dummy variables for counties with missing price variables. Butter and rye are the most
universally reported prices. Virtually all price series tend to vary less across counties
during the last half of the nineteenth century as transportation improved and local
markets became more integrated. For example, the cross-county coefficient of variation
for the price of rye decreased from .12 in 1860-64 to .07 in 1910-14, butter from .1[...]
to .06, pork from .19 to .15, and male day wages in agriculture from .24 to .16 (Jorberg
1972, 2:222-29).
higher fertility. The same pattern holds for those regions in the south
and north where the production of pork is sufficiently uncommon
that pork is not an accepted basis for settling government tax obligations.

The second set of regressions adds as explanatory variables the
earliest data on the location of employment opportunities outside of
agriculture that might have had an effect on both the overall level of 
wages and the ratio of female wages to male wages. As we noted 
earlier, women were disproportionately employed in textiles and food 
processing, and both industries were geographically concentrated in a 
few counties. 8 Specifically, the percentage of a county’s labor force in 
textiles and food processing in 1896 and the percentage in forestry in 
1910 are the variables added to the second set of regressions in table 
2. As anticipated, forestry bids up the level of male agricultural 
wages, depresses female-to-male wage ratios, and is associated with 
higher overall fertility. Industries with predominantly female em¬ 
ployment are associated with higher female relative wages and lower 
fertility, but these industries are not statistically significantly related to 
male wage levels. 

The final set of regressions in table 2 adds the urban proportion of 
the population and the child mortality rate from birth through age 
10. As with the location of textile and food processing industries, it 
might be argued that urbanization is facilitated by the geographic 
distribution of industrial investment and of investment opportunities 
in agriculture, which should themselves respond to the local level and 
structure of wages. It is common to regard such a lagged feedback 
response of urbanization and industrialization to the wage structure 
as of secondary importance compared with the direct effect of urbanization
and industrialization as determinants of wage and price levels. 9


8 Textiles were concentrated in counties M, O, P, A, and E and food processing in A,
O, and M (see fig. 1 and Thomas et al. 1941). The location of rivers as a source of power
is attributed a role in the placement of textile factories in this period and may therefore
have been independent of the local conditions of labor supply (Heckscher 1954).

9 A relevant exception to this approach is the work of Goldin and Sokoloff (1982),
who hypothesize that textiles developed in the northeastern United States in response to
the relatively low female (and child) wages in that region in the early nineteenth century.
If low female wages influenced the location of textile investments in Sweden in
this period, we might expect to see a negative sign for the coefficient on textiles and
food processing (or biased downward) in the equations explaining female to male
wages and male wage levels. One could also suspect that internal migration from rural
to urban areas and then abroad affected wages and governed the rate of urbanization.
The unbalanced sex ratio of migration among counties and overseas illustrates how the
changing economic opportunities for men and women in Sweden were in flux in this
period (Thomas 1941). Future work should explore the origins of this migration process
as jointly determined with marriage patterns and marital fertility. Haines (1979)
speculates on some of these interlinkages.





Textile and food processing   -45 7   126   337   -21 0   0687   667   4.42
                              (4 26)  (1-37)  (2 39)  (1.83)  (66)  (4.68)  (4 32)




JOURNAL OF POLITICAL ECONOMY

However, fertility is widely observed to be lower in urban than in 
rural populations in Sweden as elsewhere, and it is not possible in 
Sweden to measure the rural-urban differences in relative prices of 
child support and in child wage opportunities that might help to 
account for rural-urban differences in fertility. The relative prices of 
agricultural commodities could also differ systematically between ur¬ 
ban and rural areas, such as butter's having a higher price relative to 
rye in urban than in rural areas. 10 Including urbanization as a predetermined
variable affecting wages may thus reduce the apparent effect
of commodity prices and proxy unmeasured changes in relative
prices, incomes, and work environment that contributed in urban 
areas to the adoption of a smaller family size goal. 

It is common to view child mortality as determined by the local 
economic environment and a region’s health technology, but this view 
neglects the possibility that causation may flow in the opposite direction.
In other words, the level of fertility may itself have a direct effect
on child mortality, and factors omitted from this study could also be
partially responsible for the levels of both fertility and mortality. 11
Consequently, some proportion of the positive covariation anticipated
between child mortality and fertility may be due to omitted factors
that affect fertility and thereby influence child mortality or that influence
both by unspecified mechanisms. For these reasons, fertility
and wages are first treated as a function of only the least controversial
determinants, namely, the relative commodity prices. Then the list of
explanatory variables is augmented to include industrial structure,
urbanization, and child mortality. Pooling six decadal observations
for 25 county areas of Sweden assumes independence of observations,
and consequently the estimates reported here do not exploit the


10 My attention was drawn to this possibility by members of a seminar at the Economic
History Institute at the University of Lund in October 1983. Such a regional
pattern in commodity prices could be attributed to higher urban incomes in conjunction
with the greater income elasticity of demand for butter and animal products
compared with that for food grains. These differences in local consumption patterns
might affect relative prices if transportation costs remained significant for more perishable
animal products. Urbanization may, therefore, account for some of the simple
association between higher butter prices and lower fertility if the urbanization process
were both instigating the movement in relative commodity prices and motivating the
secular decline in fertility by increasing omitted child price variables.

11 Long-standing regional differences in mortality in Sweden are documented back
to the early eighteenth century (Heckscher 1954; Utterstrom 1965) and stressed by
Sundbarg (1907) in his compilation of data (see table A2). Eastern Sweden reported
substantially higher mortality, particularly among children, than did western (and
southern) Sweden. Both Sundbarg and Utterstrom attribute the higher fertility of
eastern Sweden to earlier marriage, more universal incidence of marriage, and higher
rates of illegitimacy. Portions of the less densely settled northern counties reported
relatively low mortality, just as the city of Stockholm until 1914 reported the highest
mortality levels in the country.
likely covariance structure in the disturbances. Alternative error-component
estimators for data in this form might, therefore, alter
our hypothesis tests and standard errors, though not necessarily af¬ 
fect point estimates. 12 

The total fertility rate and the seven age-specific birthrates that sum 
up to (one-fifth) the total fertility rate are each regressed on the child 
mortality and urbanization variables and on the predicted values of 
the male wage and female-to-male wage ratio obtained from the first- 
stage regressions (2) and (3) in table 2. These instrumental variable 
estimates of the effects on fertility of the male wage level and wages of 
women relative to men are identified in table 3 by the minimal exclu¬ 
sion restriction of the commodity price series, whereas those in table 4 
are overidentified by the further inclusion of the industrial composi¬ 
tion of the labor force, as well as urbanization and child mortality in 
the first-stage regressions. These instrumental variable estimates of 
demand effects of wages are statistically consistent if the instru¬ 
ments—the commodity prices and industrial structure variables—are 
uncorrelated with the statistical error in the fertility equation. The 
inconsistent OLS estimates of the fertility equations, where women's 
wages are likely to reflect labor supply responses also, are reported 
for comparison in Appendix table A3. 
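
The two-step procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's estimation: the variable names (`price`, `supply_shock`, `wage_ratio`, `fertility`), the sample size, and every coefficient are assumptions chosen only to show why an instrument that shifts labor demand can recover a wage effect that OLS misstates when wages also reflect supply behavior.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, endog, exog, instruments):
    """2SLS: regress y on [exog, endog], instrumenting endog with [exog, instruments]."""
    Z = np.column_stack([exog, instruments])    # first-stage regressors
    endog_hat = Z @ ols(Z, endog)               # predicted values of the endogenous variable(s)
    X_hat = np.column_stack([exog, endog_hat])  # second-stage regressors
    return ols(X_hat, y)

# Synthetic data only; none of these numbers come from the paper.
rng = np.random.default_rng(0)
n = 5000
const = np.ones(n)
price = rng.normal(size=n)         # hypothetical instrument, e.g. a relative commodity price
supply_shock = rng.normal(size=n)  # unobserved factor that moves both wages and fertility
wage_ratio = 0.8 * price + supply_shock + rng.normal(size=n)
fertility = 4.0 - 1.0 * wage_ratio + 2.0 * supply_shock + rng.normal(size=n)

exog = const[:, None]
beta_ols = ols(np.column_stack([exog, wage_ratio]), fertility)
beta_iv = two_stage_least_squares(fertility, wage_ratio[:, None], exog, price[:, None])

print("OLS wage coefficient:", beta_ols[1])  # pulled toward zero by the supply shock
print("IV wage coefficient: ", beta_iv[1])   # close to the assumed true value of -1.0
```

In this sketch the instrument is valid by construction, since `price` is drawn independently of the error in the fertility equation. That condition corresponds to the consistency requirement stated in the text and cannot be verified from the fitted values alone.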

Child mortality is significantly associated with total fertility and with 
all age-specific fertility rates from age 20 to age 39 in tables 3 and 4. 
The proportion of the population living in urban areas is associated 
with lower fertility among women over age 25, but, interestingly, the 
birthrate among teenagers is higher in urban areas, suggesting earlier 
marriage or at least earlier childbearing. The real wage rate for males
in agriculture is not related to the overall level of total fertility, but it is
significantly associated statistically with the age pattern of childbearing.
A higher male real wage is associated with higher birthrates
among younger women (aged 15-29) and lower birthrates among
older women (aged 35-49). If these estimates measure the real wage
effect on the life-cycle process of family formation, then higher male
wages accelerate entry into marriage and childbearing but do not
significantly raise completed fertility, as approximated here by the 
period total fertility rate. A rise in wages does, however, shorten the 
period between generations and thereby accelerates slightly the rate 
of population growth. The standard Malthusian hypothesis that in¬ 
creases in real wages contribute to earlier marriage appears to be 

12 Within-region (i.e., fixed-effect) estimators or first-differenced specifications of the
model would not be attractive, however, since they neglect informative patterns of
persistent cross-sectional variation in fertility levels and economic and demographic
conditions. For example, differences in the extent of forestry activities persist as do the
higher levels of fertility in the northern counties that depended on this industry.
confirmed by these Swedish data, just as Heckscher (1954) noted that
marriage rates in Sweden responded sensitively to the harvest cycle 
and Ohlin (1955) observed that opening up new frontier lands in¬ 
creased wages in Scandinavia, attracted immigrants, and raised fertil¬ 
ity. Of greater novelty is the finding that the Swedish population in 
this period was able to restrain fertility within marriage to compensate 
fully for the pattern of earlier marriage and childbearing that oc¬ 
curred in high-wage areas. 

The level of female to male wages exerts a depressing effect on
birthrates at all ages except among teenagers. This effect is spread 
quite evenly over the childbearing years, suggesting that the labor 
market opportunities of women do not simply delay entry of women 
into marriage, as assumed by Kussmaul (1981) for early modern En¬ 
gland, but may also exert a moderating effect on fertility that might 
be difficult to deduce by traditional demographic methods, which 
focus on parity-specific control as a deviation from “natural fertility”
(Coale 1973). 

Table 4 reports the preferred estimates; these are not qualitatively
different from those in table 3, although the inclusion of the industrial
composition of the labor force, urbanization, and child mortality
in the list of instruments used to estimate the wage variables tends to
reduce the magnitude of the coefficients on the female to male wage
and child mortality variables and also to improve the precision of
these estimates. 13 If one translates the child mortality coefficient into
an estimate of the partial adjustment of births to a child death, the
estimates in table 4 imply a replacement ratio of .65. Thus, about two-thirds
of the increase in population growth caused by the rapid decline
in child mortality in this half century appears to have been offset
by a compensating decline in fertility, but since the replacement rates
were larger for birthrates of younger women, one may assume that
expectations as well as realized child mortality may have influenced
the observed pattern of Swedish fertility. 14
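
The replacement ratio of .65 can be reproduced from the figures given in the accompanying footnote (coefficient on child mortality 3.124, mean fertility rate 4.227, mean child mortality rate 0.198). The sketch below assumes the footnote's formula reads b/(F + b·CM); the function and argument names are illustrative and not the paper's.

```python
def replacement_ratio(b, mean_fertility, mean_child_mortality):
    """Births adjusted per child death: b / (F + b * CM), as read from the footnote."""
    return b / (mean_fertility + b * mean_child_mortality)

# Values quoted in the footnote for the total fertility rate equation of table 4.
ratio = replacement_ratio(b=3.124, mean_fertility=4.227, mean_child_mortality=0.198)
print(round(ratio, 3))  # 0.645
```

A ratio below one means that a fall in child deaths is only partly offset by fewer births, which is why the text speaks of about two-thirds of the induced population growth being offset.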

Table 5 simulates the implications of the estimates from tables 3
and 4 for the actual changes in the explanatory variables (from table 

13 Table 2 shows that this augmented list of instruments accounts for nearly four
times the variance in the female-to-male wage ratio and a third more of the variance in
the male wage level. This improved explanatory power of the instruments may account
for the greater precision of the estimates of the coefficient on the female-to-male wage
ratio in the fertility equations reported in table 4 compared with those in table 3.
14 A derivative of births with respect to child deaths under age 10 is calculated as
$dB/dD = b/(F + b \cdot CM)$, where b is the coefficient estimated for the child mortality rate, F
is the sample mean of the fertility rate, and CM is the sample mean of the child
mortality rate (i.e., 0.198) (Lee and Schultz 1982). Thus the replacement ratio for the
total fertility rate from table 4 is 0.645 = 3.124/[4.227 + 3.124(0.198)]. For the first six
age-specific birthrates the replacement rates fall from 1.16, 1.14, 0.82, 0.69, and 0.36 to
0.14.
1) and compares these predicted changes in fertility with the actual
changes occurring over the half century studied. The estimates in 
table 3 overpredict the decline in the total fertility rate by 23 percent, 
while the estimates in table 4 underpredict the decline by 14 percent. 
The preferred estimates from table 4 overpredict changes in age- 
specific birthrates for the young, for whom fertility rates increased in 
this period, and underpredict slightly the fertility declines after age 
25. Since the same explanatory variables are included in each of the 
fertility regressions, the sum of the age-specific coefficients on any 
explanatory variable is equal to one-fifth of that variable’s coefficient 
in the total fertility rate equation. A consistent adding-up of predicted 
effects is therefore maintained. 15
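
The adding-up property follows from the linearity of least squares and can be checked numerically. The sketch below uses random synthetic data, with the total fertility rate defined as five times the sum of seven age-specific rates; the array names and dimensions are assumptions for illustration only. The same argument carries over to any linear estimator applied with identical regressors in every equation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_vars, n_ages = 150, 4, 7

# Identical regressors (constant plus four explanatory variables) in every equation.
X = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, n_vars))])
asfr = rng.normal(size=(n_obs, n_ages))  # hypothetical age-specific birthrates
tfr = 5.0 * asfr.sum(axis=1)             # five-year age groups sum to the total rate

coef_age = np.linalg.lstsq(X, asfr, rcond=None)[0]  # one column of coefficients per age group
coef_tfr = np.linalg.lstsq(X, tfr, rcond=None)[0]

# Summed age-specific coefficients equal one-fifth of the total-rate coefficients.
adds_up = np.allclose(coef_age.sum(axis=1), coef_tfr / 5.0)
print(adds_up)
```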

The effect of these four explanatory variables on marital fertility 
and the proportion married is explored tentatively in a final set of 
estimates. Since marital fertility rates are available only from 1870, 
the analysis is limited to 1870-1910. The dependent variables are the
total fertility rate, as before, the marital fertility rate, and a propor¬ 
tion married equal to the ratio of total fertility to marital fertility (see 
table 1 for definitions). This residually constructed proportion mar¬ 
ried also captures changes in illegitimate birthrates (which were rela¬ 
tively constant in this period) and in the age composition of married 
women that are ignored in the reported marital fertility rate. 

The marital fertility regressions are shown in table 6, first in linear 
form as before and then with the three dependent variables in 
logarithmic form. This logarithmic specification facilitates the decomposition
of the effect of explanatory variables on the total fertility rate
presented in the third portion of table 6. Child mortality exerts its 
effect on the total fertility rate mainly (93 percent) through depress¬ 
ing marital fertility, implying that the fertility response is occurring 
through hedging and replacement behavior within marriage, and not 
by anticipations that influence the age at marriage as observed in 
Taiwan and Korea (Schultz 1980; Lee and Schultz 1982). Urbaniza¬ 
tion also exerts 85 percent of its effect on fertility through depressing 
marital fertility rates. Male wages appear to increase the proportion 
married, as was surmised from the male wage effect on the age- 


15 Alternatives to age-specific birthrates to summarize the age schedule of fertility
should also be investigated. Breckenridge (1983) proposed such a three-parameter
representation of the age schedule of fertility and fit these parameters to the Swedish
data at the county level for the same decadal intervals analyzed here. Using the IV
specification of the model in table 4, urbanization and the male real wage rate are
positively associated with the woman's fertility level parameter ((3,). The female relative
to male wage and urbanization are associated with later ages of childbearing (|i, r ) given
the level, whereas child mortality and male wages are associated with childbearing at
young ages.
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specific birthrates. The femalc-to-male wage ratio affects only mantai 
birthrates, confirming the earlier inference that relatively higher 
wage opportunities for women do not only deter then early mai rmge, 
but rather also reduce their later reproductive performance, perhaps 
by means of relatively uniform practice of traditional methods of 
birth control throughout marriage. 
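
The logarithmic decomposition described above rests on the identity that the total fertility rate equals the marital fertility rate times the residually constructed proportion married, so that effects measured in logarithms add across the two channels. A minimal sketch follows. The coefficient values are hypothetical, chosen only so that the marital share comes out near the 93 percent reported for child mortality, and the function name is an illustration rather than anything taken from the paper.

```python
import math


def decompose_effect(b_marital, b_married):
    """Split a log-linear effect on total fertility into its two channels.

    Because ln(TFR) = ln(marital fertility) + ln(proportion married),
    the coefficient of a regressor in the ln(TFR) equation is the sum of
    its coefficients in the two component equations.
    """
    total = b_marital + b_married
    return {
        "total": total,
        "share_marital": b_marital / total,
        "share_married": b_married / total,
    }


# Identity behind the decomposition (illustrative levels, not Swedish data).
tfr, marital_rate = 4.5, 9.0
prop_married = tfr / marital_rate  # constructed residually, as in the text
assert math.isclose(math.log(tfr),
                    math.log(marital_rate) + math.log(prop_married))

# Hypothetical log coefficients on child mortality (assumed values).
result = decompose_effect(b_marital=0.279, b_married=0.021)
print(result)
```

The same function applied to the urbanization coefficients would return the 85 percent share quoted in the text, provided the table 6 estimates were supplied in place of the assumed values.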


Conclusion 

In many high-income countries the wage received by women relative
to that received by men has increased in the last 100 years (Schultz
1981, table 7.3). Economists hypothesize that this aspect of modern
economic growth increases the opportunity cost of children and may
be a contributing factor to the secular decline in fertility. But this
trend in female to male wages may itself be caused by women's
employment and human capital investment choices, which have evolved
simultaneously with the reduction in reproductive goals. The problem
addressed in this paper is how to identify the aggregate demand
and technologically induced effects on male and female wages that
can then be examined as an exogenous price of women's time to help
explain cross-sectional and time-series variation in fertility.

The approach adopted here is to estimate how the abrupt worldwide
change in the relative prices of grains and animal products that
occurred in the 1880s filtered through the largely agricultural
Swedish economy to affect the labor market. An effort was also made
to hold constant real male wages, child mortality, and urbanization, all
of which were changing rapidly in the period 1860-1910.

Specifically, the disappearance of exports of Swedish oats and the
rapid growth in exports of butter from 1870 to 1900 can be traced to
the decline in the price of cereals relative to livestock products. It is
hypothesized by English historians that earlier swings in the price of
livestock products relative to grains (from the sixteenth to the
eighteenth centuries) contributed to swings in the relative employment of
women in farm service, with repercussions on their age at marriage
and hence aggregate fertility (Kussmaul 1981; R. Smith 1981). The
late nineteenth-century improvement in the price of butter relative to
grains may have analogously contributed to an improvement in women's
wages in Sweden relative to men's and thus to a more rapid
decline in fertility than would otherwise have occurred in Sweden.
Other developments may have worked in the opposite direction. The
northern counties of Sweden were involved in a rapid expansion of
exports of timber from the 1860s to 1900, and the predominance of
male workers in lumbering provides an explanation for the persistently
high fertility in some of these areas until well into the twentieth
century. Conversely, a few southern and eastern counties developed
textile and food-processing industries that provided employment for
76 percent of the female industrial labor force in 1920. These
geographic patterns in location of industry are assumed here to be
exogenous to the fertility determination process, dictated primarily by
natural resource endowments, transportation, and early river power for
mechanization.

The first stage in the analysis confirms that where the price of
butter relative to rye (the basic food grain in Sweden) was higher,
female to male wages were higher and fertility lower. The reverse was
true for areas where pork prices were relatively high. Local industrial
employment in forestry is associated with higher male wages;
conversely, greater employment in textile and food processing is linked
to higher female to male wages. The second stage in the analysis
reports instrumental variable estimates, implying that a quarter of the
decline in the Swedish total fertility rate from 1860 to 1910 can be
explained by the 10 percent rise in the female-to-male wage ratio.
Another quarter is associated each with the 50 percent reduction in
child mortality and with the increase in urban share of the population
to one-quarter. The doubling of real wages paid to male day workers
in agriculture is not associated with a change in total fertility rates but
is estimated to contribute to earlier marriage and higher fertility
among women under age 30 and, consequently, to lower fertility
among women older than 30. Cohort data would be needed to examine
the underlying dynamics of the family formation process during
this period, an analysis beyond the scope of this paper. When the real
wages of men, child mortality, and urbanization are held constant, the
aggregate county-level data for this 50-year period in Sweden suggest
that the appreciating value of women's time relative to men's played
an important role in the Swedish fertility transition.

Appendix 

Table A1 summarizes long-term trends in prices, wages, and derived
measures of relative wages and prices in Sweden. Table A2 defines the
regional administrative units used in the empirical analysis and shown in
figure 1. The regional codes are also indicated for the wage series by
Bagge et al. (1933), the price series by Jörberg (1972), and the
demographic groupings by Sundbärg (1907). The primary purpose of this
Appendix is to restate the estimation issues that arise in the paper and
to indicate the source and likely direction of the bias in estimating the
effect of female wages on fertility by ordinary least squares (OLS). The
asymptotically unbiased instrumental variable (IV) estimates provided in
the paper are then briefly contrasted with the inconsistent OLS estimates
reported below.
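
The derived rows of table A1 follow from its raw price and wage rows by simple division, which gives a quick consistency check on the table. The sketch below uses the rye, butter, male day wage, and cost of living figures as read from table A1; the variable and function names are illustrative only, and the rounding convention is an assumption.

```python
# Values read from table A1 (columns: 1732, 1758, 1815, 1860, 1913).
YEARS = [1732, 1758, 1815, 1860, 1913]
RYE = [0.45, 1.11, 7.50, 8.89, 9.63]
BUTTER = [0.06, 0.11, 0.83, 1.26, 2.01]

# The cost of living index is reported only for the last three columns.
MALE_DAY_WAGE = {1815: 0.68, 1860: 1.07, 1913: 2.74}
COST_OF_LIVING = {1815: 79, 1860: 100, 1913: 132}  # index, 1860 = 100


def butter_to_rye():
    """Price of butter relative to rye, rounded to two decimals."""
    return [round(b / r, 2) for b, r in zip(BUTTER, RYE)]


def implicit_real_wage():
    """Male day wage deflated by the cost of living index, times 100."""
    return {
        year: round(100 * MALE_DAY_WAGE[year] / (COST_OF_LIVING[year] / 100))
        for year in COST_OF_LIVING
    }


print(dict(zip(YEARS, butter_to_rye())))
print(implicit_real_wage())
```

Both computed rows can be compared directly with the "price of butter relative to rye" and "implicit real agricultural daily wage" rows of the table.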

County demand for female labor is assumed to depend on the wages of
females and males, a vector of prices of alternative agricultural outputs and
inputs ($P_A$) for which male and female labor have different comparative
TABLE A1

Average Price Levels and Number of Swedish Counties (in Parentheses)
Reporting Prices for Selected Commodities and Years

Commodity                     1732        1758        1815        1860        1913

Grains:
  Rye                      .45 (?)    1.11 (28)   7.50 (30)   8.89 (30)   9.63 (30)
  Wheat                    .68 (11)   1.32 (13)   9.84 (13)   13.2 (24)   11.2 (23)
  Barley                   .41 (27)    .97 (28)   5.17 (30)   7.75 (30)   8.29 (30)
Livestock products:
  Butter                   .06 (21)    .11 (23)    .83 (26)   1.26 (29)   2.01 (29)
  Pork                     .04 (16)    .08 (17)    .55 (17)    .78 (20)   1.24 (22)
  Beef                     .02 (15)    .04 (10)    .29 (16)    .53 (18)   1.10 (19)
Lumber:
  Pinewood                 .08 (2)     .16 (2)     .98 (29)    .53 (18)   1.10 (19)
Labor:
  Day wages male,
    summer/harvest         .07 (10)    .09 (20)    .68 (30)   1.07 (29)   2.74 (29)
  Day wages of hand
    and two horses         .12 (8)     .17 (19)   1.61 (26)   2.89 (27)   6.65 (27)
Maximum number
  of regions                 27          28          30          30          30

Derived Measures of Price Levels and Real Wages

Index of cost of living*     ...         ...         79          100         132
Real wage index for
  agricultural workers
  (20-year average)†         100         85          86          97          165
Implicit real agricultural
  daily wage‡                ...         ...         86          107         208
Price of butter relative
  to rye                     .13         .10         .11         .14         .21

Source: Jörberg (1972), vol. 1, and various tables from vol. 2. Cost of living index from Myrdal (1933).
* Table XIX 11, p. 150, for 1815-20, 1860-64, 1910-14.
† Table XIX 8, p. 1144, for 1735-54, 1755-74, 1805-24, 1855-74, 1895-1914.
‡ Calculated by dividing annual daily male wage above by cost of living index.

advantages, local natural resource endowments that account for the location
of industrial employment ($I$) with a distinctly different female-to-male
employment ratio (i.e., forestry, textiles, and food processing), and a
modeling and measurement error:

    $L_f^d = d(w_f, w_m, P_A, I, e_1).$    (A1)

County supply of female labor is also a function of female and male wages
and a second error that probably contains many omitted price, home
production, culture, and taste factors:

    $L_f^s = s(w_f, w_m, e_2).$    (A2)

The local county wage for women is the one that equates their labor supply
and demand. The reduced form for female wages then can be written

    $w_f = w_f(P_A, I, e_1, e_2).$    (A3)

Male labor supply and demand are represented analogously with the reduced
form for male wages also depending on $P_A$ and $I$.

Our final interest attaches to the demand for children or a fertility
equation:

    $F = f(w_f / w_m, X, e_4),$    (A4)

where $X$ represents fertility determinants other than wages, such as
urbanization and child mortality, and $e_4$ is a fourth error encompassing
omitted factors such as culturally mediated tastes. Note that the female
wage is specified relative to the male wage to reduce the likely
collinearity between wages of men and women.

The statistical problem with estimating the fertility equation (A4) directly
is that its error will undoubtedly be correlated with that in the labor
supply equation, and probably inversely, namely, $E(e_2 e_4) < 0$.
Consequently, observed wages will be correlated with $e_4$, and direct OLS
estimates of equation (A4) will be inconsistent. In this paper, $P_A$ and
$I$ are used as IVs to obtain consistent estimates of (A4) that are free of
the simultaneous equations bias.

Since women's market labor supply tends to be relatively elastic (compared
with men's) with respect to own wage, one expects demand shocks to elicit
relatively larger compensating labor supply variations for women than for
men (Schultz 1981; Killingsworth 1983). Therefore, OLS estimates of
equation (A4) would likely understate the effect of women's wages on
fertility compared with the IV estimates, whereas male wage effects might
not be seriously biased by the endogenous labor supply behavior of men.
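
The direction of bias argued above can be illustrated with a small simulation. The sketch below is not the paper's estimator: it assumes a single instrument, a linear fertility equation with a true wage effect of -1, and a labor supply shock that lowers the wage and is inversely correlated with the fertility error. All names and parameter values are hypothetical.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate(n=20000, beta=-1.0, seed=0):
    """Return (OLS slope, IV slope) for fertility regressed on the wage.

    z  : demand-side instrument (stands in for prices and industry mix)
    e2 : labor supply shock, which lowers the observed wage
    e4 : fertility error, inversely correlated with e2
    """
    rng = random.Random(seed)
    z, wage, fertility = [], [], []
    for _ in range(n):
        z_i = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        e4 = -0.8 * e2 + 0.6 * rng.gauss(0.0, 1.0)
        w_i = z_i - e2
        z.append(z_i)
        wage.append(w_i)
        fertility.append(beta * w_i + e4)
    ols = cov(wage, fertility) / cov(wage, wage)
    iv = cov(z, fertility) / cov(z, wage)
    return ols, iv


ols, iv = simulate()
print(f"OLS slope: {ols:.3f}   IV slope: {iv:.3f}   true effect: -1.000")
```

Under these assumptions the OLS slope is pulled toward zero, so it understates the negative wage effect, while the IV slope stays close to the true value.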

Table A3 reports inconsistent OLS estimates of equation (A4) to test our
expectations. The female-to-male wage rate has no statistically significant
effect on the total fertility rate according to these OLS estimates, though
its association with fertility for ages 15-29 remains inverse and
statistically significant at the 5 percent level of confidence. The OLS and
IV estimates of the male wage effect are similar; they remain large and
positive for fertility rates among women less than age 30 and negative for
birthrates of women over age 40. The apparent effects of child mortality and
urbanization are increased by the neglect of simultaneous equations bias.
But primarily it is women's relative wage effects on fertility that are
changed (reduced overall and shifted to the young ages) by neglect of the
anticipated simultaneous equations bias. It is precisely this effect of
female labor supply behavior concealing the magnitude of the fertility
response to shifting demands for female labor that would have obscured from
direct (OLS) view the time-series evidence that secularly rising female
wages contributed to the Swedish fertility transition. To the extent that
the uncompensated female labor supply own-wage effect outweighs the
cross-price effect of husband's wages (J. Smith 1980), increased female
labor force participation has reduced the observed increase in Swedish
female relative wages.








[Table A2, which defines the regional administrative units and their codes, did not survive extraction; the legible entries are Kalmar North, Kalmar South, Isle of Oland, and Isle of Gotland.]


References 

Ankarloo, Bengt. "Agriculture and Women's Work: Directions of Change in
the West, 1700-1900." J. Family Hist. 4 (Summer 1979): 111-20.

Bagge, Gösta; Lundberg, Erik; and Svennilson, Ingvar. Wages in Sweden,
1860-1930. 2 vols. Stockholm: King, 1933.

Becker, Gary S. "An Economic Analysis of Fertility." In Demographic and
Economic Change in Developed Countries. Princeton, N.J.: Princeton Univ.
Press (for N.B.E.R.), 1960.

Breckenridge, Mary B. Age, Time, and Fertility: Applications of Exploratory
Data Analysis. New York: Academic Press, 1983.

Coale, Ansley J. "The Demographic Transition Reconsidered." In
International Population Conference, vol. 1. Liège, Belgium: Internat. Union
Scientific Study of Population, 1973.

Fridlizius, Gunnar. Swedish Corn Export in the Free Trade Era: Patterns in
the Oats Trade, 1850-1880. Lund: Berlingska Boktryckeriet (for Inst. Econ.
Hist.), 1957.

Goldin, Claudia, and Sokoloff, Kenneth. "Women, Children, and
Industrialization in the Early Republic: Evidence from the Manufacturing
Censuses." J. Econ. Hist. 42 (December 1982): 741-74.

Haines, Michael R. Fertility and Occupation: Population Patterns in
Industrialization. New York: Academic Press, 1979.

Heckscher, Eli F. An Economic History of Sweden. Cambridge, Mass.: Harvard
Univ. Press, 1954.

Hofsten, Erland, and Lundström, Hans. Swedish Population History: Main
Trends from 1750 to 1970. Stockholm: Statistiska Centralbyrån, 1976.

Jörberg, Lennart. A History of Prices in Sweden, 1732-1914. 2 vols. Lund:
Gleerup, 1972.

Karlström, U. "Urbanization and Industrialization: Modeling Swedish
Demoeconomic Development from 1870-1914." Working Paper no. 80-54.
Internat. Inst. Applied Systems Analysis, April 1980.

Killingsworth, Mark R. Labor Supply. Cambridge: Cambridge Univ. Press,
1983.

Knodel, J. "The Influence of Child Mortality in a Natural Fertility
Setting." In Natural Fertility: Patterns and Determinants of Natural
Fertility, edited by Henri Leridon and Jane Menken. Liège, Belgium: Ordina,
1979.

Kussmaul, Ann. Servants in Husbandry in Early Modern England. Cambridge:
Cambridge Univ. Press, 1981.

Lee, B. S., and Schultz, T. Paul. "Implications of Child Mortality
Reductions for Fertility and Population Growth in Korea." J. Econ.
Development 7 (July 1982): 21-44.

Lindahl, Erik; Dahlgren, Einar; and Kock, Karin. National Income of Sweden,
1861-1930. 2 vols. Stockholm: Norstedt (for Stockholm Univ., Inst. Soc.
Sci.), 1937.

Lockridge, Kenneth A. The Fertility Transition in Sweden: A Preliminary Look
at Smaller Geographic Units, 1855-1890. Demographic Data Base. Umeå,
Sweden: Univ. Umeå, 1983.

Malthus, Thomas R. An Essay on the Principle of Population, as It Affects
the Future Improvement of Society. London: Johnson, 1798.

Matthiessen, P. C. "The Limitation of Family Size in Denmark." European
Fertility Project. Mimeographed. Princeton, N.J.: Princeton Univ., 1983.
Mincer, Jacob. "Market Prices, Opportunity Costs, and Income Effects." In
Measurement in Economics: Studies in Mathematical Economics and
Econometrics in Memory of Yehuda Grunfeld, by Carl F. Christ et al.
Stanford, Calif.: Stanford Univ. Press, 1963.

Mosher, William D. "Demographic Responses and Demographic Transitions:
A Case Study of Sweden." Demography 17 (November 1980): 395-412.

Mosk, Carl. Patriarchy and Fertility: The Evolution of Natality in Japan and
Sweden, 1880-1960. New York: Academic Press, 1983.

Myrdal, Gunnar. The Cost of Living in Sweden, 1830-1930. London: King (for
Univ. Stockholm, Inst. Soc. Sci.), 1933.

Ohlin, G. "The Positive and the Preventative Check." Ph.D. dissertation,
Harvard Univ., 1955.

Schultz, T. Paul. "An Economic Interpretation of the Decline in Fertility in
a Rapidly Developing Country." In Population and Economic Change in
Developing Countries, edited by Richard A. Easterlin. Chicago: Univ. Chicago
Press (for N.B.E.R.), 1980.

Schultz, T. Paul. Economics of Population. Reading, Mass.: Addison-Wesley,
1981.

Slicher van Bath, B. H. The Agrarian History of Western Europe, A.D.
500-1850. London: Arnold, 1963.

Smith, Adam. An Inquiry into the Nature and Causes of the Wealth of Nations.
1776. Reprint ed. Edited by Edwin Cannan. London: Methuen, 1961.

Smith, James P., ed. Female Labor Supply: Theory and Estimation. Princeton,
N.J.: Princeton Univ. Press, 1980.

Smith, James P., and Ward, Michael P. "Time-Series Growth in the Female
Labor Force." In Trends in Women's Work, Education, and Family Building,
edited by Richard Layard and Jacob Mincer. J. Labor Econ. 3, no. 1, pt. 2
(January 1985): S59-S90.

Smith, Richard M. "Fertility, Economy, and Household Formation in England
over Three Centuries." Population and Development Rev. 7 (December 1981):
595-622.

Snell, Keith M. D. "Agricultural Seasonal Unemployment, the Standard of
Living, and Women's Work in the South and East, 1690-1860." Econ. Hist.
Rev., 2d ser. 34 (August 1981): 407-37.

Statistiska Centralbyrån. Historical Statistics of Sweden. Vol. 1,
Population. Stockholm: Statistiska Centralbyrån, 1955.

Sundbärg, Axel Gustav. Bevölkerungsstatistik Schwedens 1750-1900.
Stockholm: Norstedt and Söner, 1907.

Thomas, Dorothy S. Social and Economic Aspects of Swedish Population
Movements, 1750-1933. New York: Macmillan, 1941.

Thomas, Dorothy S., et al. Population Movements and Industrialization:
Swedish Counties, 1895-1930. 2 vols. London: King (for Stockholm Univ.,
Inst. Soc. Sci.), 1941.

Utterström, G. "Two Essays on Population of 18th Century Scandinavia." In
Population in History: Essays in Historical Demography, edited by David V.
Glass and David E. C. Eversley. Chicago: Aldine, 1965.



The Structure of Corporate Ownership: 
Causes and Consequences 


Harold Demsetz 

University of California, Los Angeles 


Kenneth Lehn 

Washington University


This paper argues that the structure of corporate ownership varies
systematically in ways that are consistent with value maximization.
Among the variables that are empirically significant in explaining
the variation in ownership structure for 511 U.S. corporations are
firm size, instability of profit rate, whether or not the firm is a
regulated utility or financial institution, and whether or not the firm
is in the mass media or sports industry. Doubt is cast on the
Berle-Means thesis, as no significant relationship is found between
ownership concentration and accounting profit rates for this set of
firms.


Large publicly traded corporations are frequently characterized as
having highly diffuse ownership structures that effectively separate
ownership of residual claims from control of corporate decisions.
This alleged separation of ownership and control figures prominently
both in the economic theory of organization and in the ongoing debate
concerning the social significance of the modern corporation, a
debate that we join later in this paper.¹ Our primary concern, however,
is to explore some of the broad forces that influence the structure
of corporate ownership. Our conjectures about the determinants
of ownership structure are examined empirically.

1 Recent literature that has examined the separation of ownership and
control includes Jensen and Meckling (1976) and Fama and Jensen (1983a,
1983b). The debate concerning the social implications of diffuse ownership
of corporate equity had its genesis in Berle and Means (1933).
Inspection of ownership data reveals that the concentration of
equity ownership in U.S. corporations varies widely. For a sample of
511 large U.S. corporations, table 1 lists the distribution of three
measures of ownership concentration: the percentage of a firm's
outstanding common equity owned by the five largest shareholders (A5),
the percentage of shares owned by the 20 largest shareholders (A20),
and an approximation of a Herfindahl measure of ownership
concentration (AH). This sample and these data will be described more
fully later in the paper. We simply note here the variation in ownership
concentration. The value of A5 ranges from 1.27 to 87.14 around a
mean value of 24.81. Similar variation is found in the values of A20
and AH: A20 ranges from 1.27 to 91.54 and AH ranges from 0.69 to
4,952.88. The corresponding average values of these two variables
are 37.66 and 402.75, respectively.
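
To make the three concentration measures concrete, the sketch below computes them from a list of percentage holdings. It assumes that AH is the sum of squared percentage shares over the reported holders, which is one common way to approximate a Herfindahl index and is consistent with the reported range below 10,000; the paper defines its own approximation later, so this is an illustration only. The holdings are invented.

```python
def ownership_measures(shares_pct):
    """Return (A5, A20, AH) from percentage shareholdings.

    A5  : combined share of the five largest holders
    A20 : combined share of the twenty largest holders
    AH  : sum of squared percentage shares (assumed Herfindahl form)
    """
    ranked = sorted(shares_pct, reverse=True)
    a5 = sum(ranked[:5])
    a20 = sum(ranked[:20])
    ah = sum(share * share for share in ranked)
    return a5, a20, ah


# Hypothetical firm: five larger holders and fifteen holders of 1 percent.
holdings = [10.0, 5.0, 5.0, 3.0, 2.0] + [1.0] * 15
a5, a20, ah = ownership_measures(holdings)
print(a5, a20, ah)
```

Because the shares are squared, AH weights the largest blocks much more heavily than A5 or A20 do, which is why its range across firms is so much wider.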

We approach the task of explaining the variation in these data by
considering the advantages and disadvantages to the firm's shareholders
of greater diffuseness in ownership structure. The most obvious
disadvantage is the greater incentive for shirking by owners that
results. The benefit derived by a shirking owner is his ability to use his
time and energies on other tasks and indulgences; this benefit accrues
entirely to him. The cost of his shirking, presumably the poorer
performance of the firm, is shared by all owners in proportion to the
number of shares of stock they own. The more concentrated is ownership,
the greater the degree to which benefits and costs are borne by
the same owner. In a firm owned entirely by one individual, all
benefits and costs of owner shirking are borne by the sole owner. In
this case, no “externalities” confound his decision about attending to
the tasks of ownership. In a very diffusely owned firm, the divergence
between benefit and costs would be much larger for the typical owner,
and he can be expected to respond by neglecting some tasks of
ownership.
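
The shirking argument can be reduced to a one-line comparison: an owner keeps the whole private benefit of shirking but bears only his ownership fraction of the cost to the firm. The sketch below encodes that comparison. The function names and the numbers are hypothetical, and the linear form is an assumption made for illustration.

```python
def net_private_gain_from_shirking(benefit, firm_cost, share):
    """Private gain to an owner holding `share` of the firm who shirks.

    The owner receives the full `benefit` but bears only `share` of the
    `firm_cost` imposed on all owners.
    """
    if not 0.0 < share <= 1.0:
        raise ValueError("share must lie in (0, 1]")
    return benefit - share * firm_cost


def breakeven_share(benefit, firm_cost):
    """Ownership fraction above which shirking no longer pays privately."""
    return benefit / firm_cost


# Hypothetical values: shirking is worth 1 to the owner and costs the firm 4.
sole_owner = net_private_gain_from_shirking(1.0, 4.0, share=1.0)
small_owner = net_private_gain_from_shirking(1.0, 4.0, share=0.05)
threshold = breakeven_share(1.0, 4.0)
print(sole_owner, small_owner, threshold)
```

With these numbers the sole owner loses by shirking, the 5 percent owner gains, and the incentive switches sign at a 25 percent stake.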

The inefficiency implied by such externalities, of itself, dictates
against diffuse ownership structures, and we would observe no diffuse
ownership structures in a “rational" world unless counterbalancing
advantages exist. Since these advantages do exist, a decision to
alter a firm’s ownership structure in favor of greater diffuseness
presumably is guided by the goal of value maximization. A theory of
ownership structure is based largely on an understanding of what
makes these advantages vary in strength from firm to firm.


Determinants of Ownership Structure 

Of the possible general forces affecting ownership structure, three 
seem important enough to merit investigation. One of these, the 
value-maximizing size of the firm, is not surprising. The second, more
subtle and difficult to measure, is the profit potential from exercising
more effective control, to which we ascribe the name control potential.
The third is systematic regulation, the general purpose of which is to
impose, in one form or another, constraints on the scope and impact
of shareholder decisions. In addition to these, we consider the amenity
potential of firms, about which more will be said below.


Value-maximizing Size 

The size of firms that compete successfully in product and input
markets varies within and among industries. The larger is the
competitively viable size, ceteris paribus, the larger is the firm’s
capital resources and, generally, the greater is the market value of a
given fraction of ownership. The higher price of a given fraction of the
firm should, in itself, reduce the degree to which ownership is
concentrated. Moreover, a given degree of control generally requires a
smaller share of the firm the larger is the firm. Both these effects of
size imply greater diffuseness of ownership the larger is a firm. This
may be termed the risk-neutral effect of size on ownership.

Risk aversion should reinforce the risk-neutral effect. An attempt
to preserve effective and concentrated ownership in the face of larger
capital needs requires a small group of owners to commit more wealth
to a single enterprise. Normal risk aversion implies that they will
purchase additional shares only at lower, risk-compensating prices.
This increased cost of capital discourages owners of larger firms from
attempting to maintain highly concentrated ownership.

As the value-maximizing size of the firm grows, both the risk-neutral
and risk-aversion effects of larger size ultimately should
weigh more heavily than the shirking cost that may be expected to
accompany a more diffuse ownership structure, so that an inverse
relationship between firm size and concentration of ownership is to be
expected. Larger firms realize a lower overall cost with a more diffuse
ownership structure than do small firms. The choice by owners of a
diffuse ownership structure, therefore, is consistent with stockholder
wealth- (or utility-) maximizing behavior.


Control Potential 

Control potential is the wealth gain achievable through more effective
monitoring of managerial performance by a firm's owners. If the
market for corporate control and the managerial labor market perfectly
aligned the interests of managers and shareholders, then control
potential would play no role in explaining corporate ownership
structure (although it might then explain the degree to which ownership
by professional management is concentrated). We assume,
however, that neither of these markets operates costlessly. In addition
to the transaction and information costs associated with the acquisition
and maintenance of corporate control, Jarrell and Bradley (1980)
have shown that there are significant regulatory costs associated with
control transactions. These nontrivial costs act effectively as a tax on
corporate control transactions. Although we are unaware of similar
empirical studies of transaction costs associated with the managerial
labor market, we assume that this market also imperfectly disciplines
corporate managers who work contrary to the wishes of shareholders.
Our view is that these transaction costs impose a specific identity and
control potential on firms. Alterations in the structure of corporate
ownership, in part, can be understood as a response to these costs.

We seek to uncover elements of a firm's environment that are pervasive
and persistent in their effect on control potential. Firm-specific
uncertainty is one such factor. Firms that transact in markets
characterized by stable prices, stable technology, stable market shares,
and so forth are firms in which managerial performance can be monitored
at relatively low cost. In less predictable environments, however,
managerial behavior simultaneously figures more prominently in a firm’s
fortunes and becomes more difficult to monitor. Frequent changes in
relative prices, technology, and market shares require timely
managerial decisions concerning redeployment of corporate assets and
personnel. Disentangling the effects of managerial behavior on firm
performance from the corresponding effects of these other, largely
exogenous factors is costly, however.² Accordingly, we believe that a
firm’s control potential is directly associated with the noisiness of the
environment in which it operates. The noisier a firm’s environment,
the greater the payoff to owners in maintaining tighter control.
Hence, noisier environments should give rise to more concentrated
ownership structures.³

Clearly, we take the view that owners believe they can influence the
success of their firms and that all outcomes are neither completely
random nor completely foreseeable. This belief constitutes an assertion
of the existence of risks, opportunities, and managerial shirking
that are in some degree controllable by owners for the profit of owners.
The profit potential from exercising a given degree of owner control
is, we believe, correlated with instability in the firm’s environment.

2 The effect of imperfect information on monitoring costs is developed
formally in Holmstrom (1979, 1982).

3 An interesting variant of the hypothesis that corporate ownership
structure is, in part, dependent on the stability of a firm's environment
is found in Smith (1937, pp. 713-14): “The only trades which it seems
possible for a joint stock company to carry on successfully, without an
exclusive privilege, are those, of which all the operations are capable of
being reduced to what is called a routine, or to such a uniformity of
method as admits of little or no variation."

This instability may be measured in many ways, by fluctuations in
product and input prices of relevance to a firm, for example, or by
variations in a firm’s market share. We rely on instability of a firm’s
profit rate, measured by variation in both stock returns and accounting
returns. Profit data are readily available, and profit variability
offers a global measure of the impact of the various subcomponents
of instability in its environment; profit also is the “bottom line” that so
interests stockholders.

The three measures of instability examined here are (1) firm-specific
risk (SE), as measured by the standard error of estimate calculated
from fitting the “market model,” (2) the standard deviation of
monthly stock market rates of return (STD_s), and (3) the standard
deviation of annual accounting profit rates (STD_a). Our intuition
favors firm-specific risk as the factor most strongly associated with the
type of instability for which control is most useful. The exercise of
control should be particularly important to those operations of a firm
that can be influenced and responded to most easily. These would
seem to include the inner functioning of the firm and its operations in
the markets in which it purchases and sells. These are proximate and
specific to the firm. In contrast to these sources of instability,
economywide events such as the rate of growth of money supply or
fluctuations in government tax-expenditure flows are beyond a firm's
control and, at best, can be reacted to intelligently. Because of these
reactive possibilities, even this more distant and less firm-specific
instability is likely to call forth more concentrated ownership, but
greater control potential is offered by instability that is more specific
to the firm.
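
The first two instability measures can be sketched for a single firm as follows. The code assumes the usual single-index form of the market model, in which the firm's return is regressed on the market return with an intercept, and takes SE to be the standard error of estimate with n - 2 degrees of freedom. The return series are invented, and the function name is illustrative.

```python
import math
import statistics


def market_model_se(firm_returns, market_returns):
    """Standard error of estimate from regressing firm on market returns."""
    n = len(firm_returns)
    mean_f = statistics.mean(firm_returns)
    mean_m = statistics.mean(market_returns)
    sxx = sum((m - mean_m) ** 2 for m in market_returns)
    sxy = sum((m - mean_m) * (f - mean_f)
              for m, f in zip(market_returns, firm_returns))
    beta = sxy / sxx
    alpha = mean_f - beta * mean_m
    ssr = sum((f - alpha - beta * m) ** 2
              for m, f in zip(market_returns, firm_returns))
    return math.sqrt(ssr / (n - 2))


# Invented monthly returns, for illustration only.
market = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
noise = [0.01, -0.01, 0.00, 0.02, -0.02, 0.00]
firm_noisy = [0.005 + 1.2 * m + e for m, e in zip(market, noise)]
firm_exact = [0.01 + 1.5 * m for m in market]  # no firm-specific risk

se_noisy = market_model_se(firm_noisy, market)
se_exact = market_model_se(firm_exact, market)
std_exact = statistics.stdev(firm_exact)  # total variability (STD_s)
print(se_noisy, se_exact, std_exact)
```

The second firm shows the distinction drawn in the text: its returns vary from month to month, so its standard deviation is positive, yet all of that variation is market related and its firm-specific risk is essentially zero.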

We include instability in accounting rates of return among our
measures, though we recognize many defects of accounting data. One
of these defects is purely statistical: whereas we have collected
monthly stock return data, our accounting data are annual data. For
any time period, then, there are 12 times as many observations with
which to calculate a stock return variance as there are for calculating
an accounting return variance. Accounting profits, however, may
reflect year-to-year fluctuations in underlying business conditions
better than stock market rates of return, since stock market rates of
return reflect expected future developments that may cloak
contemporary fluctuations in business conditions. We say “may” because
today’s accounting rate of return is influenced by past investment
expenditures (and other carryover accounting entries), and this also
attenuates the impact of “today’s” instabilities. It is not clear on a
priori grounds which measure is better suited to measure day-to-day
or year-to-year variability in the firm’s environment.
Regulation

Systematic regulation restricts the options available to owners, thus
reducing control potential in ways that may not be reflected fully in
profit instability. Regulation also provides some subsidized monitoring
and disciplining of the management of regulated firms. A bank
whose balance sheet looks too risky to regulators will find itself under
considerable pressure to replace its management. These “primary"
effects of regulation should reduce ownership concentration to a
greater degree than would be predicted simply on the basis of profit
instability.

We expect the net impact of regulation to be dominated by these
primary effects, which call for greater diffuseness of ownership in
regulated industries. There are also well-known problems of amenity
consumption by management in a regulated setting. These should be
more important than in nonregulated firms because cost-plus
price-setting regulation reduces the incentive to hold down cost while it
dulls competition. Greater control of management by owners would
seem to be called for and, hence, greater concentration of ownership.
However, owner incentives to reduce managerial amenity consumption
are also dulled by the tendency of commissions to adjust prices
toward levels that leave the profit rate unchanged, and this
counteracts the desire for greater control of management.

Amenity Potential of a Firm’s Output

Those who own large fractions of the outstanding shares of a firm
either manage the firm themselves or are positioned to see to it that
management serves their interests. Maximizing the value of the firm
generally serves these interests well, for this provides the largest
possible budget for a shareholder to spend as a “household.” The
advantage of maximizing profit through the firm and then consuming in
the household is based on the implicit assumption that specialization
in consumption is productive of maximum utility. However, when
owners can obtain their consumption goals better through the firm's
business than through household expenditures, they will strive to
control that firm more closely to obtain these goals. Just as the
potential for higher profit creates a demand for closer monitoring of
management by owners, so does the potential for firm-specific amenity
consumption.

We refer here to the utility consequences of being able to influence
the type of goods produced by the firm, not to the utility derived from
providing general leadership to the firm. We believe that there is
nonpecuniary income associated with the provision of general
leadership and with the ability to deploy resources to suit one's personal
preferences, but we are not now prepared to assert how this varies
across firms or different ownership structures.¹ However, we do
believe that two industries are likely to call forth tight control in order to
indulge such personal preferences. These are professional sports
clubs and mass media firms. Winning the World Series or believing
that one is systematically influencing public opinion plausibly
provides utility to some owners even if profit is reduced from levels
otherwise achievable. These consumption goals arise from the
particular tastes of owners, so their achievement requires owners to be in a
position to influence managerial decisions. Hence, ownership should
be more concentrated in firms for which this type of amenity potential
is greater. Unfortunately, other than a shared perception that the
sports and media industries are especially laden with amenity
potential for owners, we have no systematic way of tracking amenity
potential. On balance, we consider amenity potential a more speculative
explanation of ownership concentration in these special industries
than are size, control potential, and regulation.

Data and Measurements 

This study uses ownership data obtained from three directories
published by Corporate Data Exchange (CDE): CDE Stock Ownership
Directory: Energy (1980), Banking and Finance (1980), and Fortune 500
(1981). The sample consists of 511 firms from major sectors of the
U.S. economy, including regulated utilities and financial institutions.
These firms represent all firms for which we were able to obtain
ownership data, accounting data (from the COMPUSTAT tape), and
security price data (from the Center for Research on Security Prices
[CRSP] tape). We also examine a manufacturing and mining
subsample composed of 406 firms.

The ownership data consist of a ranking of all publicly identifiable 
stockholders who exercised investment power over 0.2 percent or 
more of the company’s common equity. The CDE used the same 

1 Ad hoc examples of the power of dominant owner-managers can be given. The
share prices of Disney, Gulf and Western, and Chock Full O’Nuts all rose dramatically
on the deaths of their dominant owners. Allegedly the prices of these stocks had been
depressed by the policies of Walt Disney to keep a considerable library of Disney films
from television, of Charles Bluhdorn to use Gulf and Western to hold a large portfolio
of stocks in other companies, and of Charles Black to use Chock Full O’Nuts to
maintain large real estate investments. All three policies are associated by the financial
community with the personal preferences of the then dominant owner-managers of
these companies. Shortly after the deaths of Disney, Bluhdorn, and Black, share prices
rose, respectively, 25 percent, 42 percent, and 22 percent. We have no systematic
procedure for determining when dominant owners are more likely to exercise their
personal preferences in “non-profit-maximizing ways” except for our belief in the
amenity potential of mass media and sports industries.
definition of investment power used by the Securities and Exchange
Commission (SEC) in application of 13(f) regulations. Specifically,
this definition includes all shares over which the stockholder has the
power to buy or sell.

The CDE used various SEC forms to secure data, including forms
3, 4, 13f, 14d-1, and 144, and, in addition, it examined corporate
proxy statements, secondary offering and merger prospectuses,
public pension plan portfolios, employee stock ownership plan
reports, and foundation and educational endowment portfolios. Where
institutional investors held shares in a management capacity (e.g.,
investment advisory agreements or trust agreements), the party for
whom they managed the shares is identified as the holder with
investment power. Similarly, when nominees held stock, the party for
whom they held the stock is identified as the holder with investment
power. Holdings by diversified financial holding companies,
investment banks, brokerage firms, and investment company managers are
listed in the “street name” of the firms when the firms are not holding
the shares in a management capacity.

Our statistical work relies heavily on the percentage of shares
owned by the most important shareholders, A5 and A20, and the
approximation of the Herfindahl index, AH. Different notation is
introduced when we discuss institutional and noninstitutional
shareholders.
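
The three concentration measures can be illustrated with a minimal sketch. It assumes the input is a list of percentage holdings for the publicly identifiable stockholders of one firm; the function name and the sample holdings are hypothetical, and AH is computed as the sum of squared percentages, as table 2 describes it.

```python
def concentration_measures(holdings_pct):
    """Return (A5, A20, AH) for one firm.

    holdings_pct: percentage of common equity held by each identified
    stockholder (hypothetical input format, not the CDE file layout).
    """
    ranked = sorted(holdings_pct, reverse=True)
    a5 = sum(ranked[:5])               # share of the five largest holders
    a20 = sum(ranked[:20])             # share of the 20 largest holders
    ah = sum(h ** 2 for h in ranked)   # Herfindahl-type sum of squares
    return a5, a20, ah


# Illustrative holdings only; these are not taken from the sample.
holdings = [12.0, 8.0, 5.0, 3.0, 2.0, 1.0, 0.5]
a5, a20, ah = concentration_measures(holdings)
print(a5, a20, ah)
```

Because only identified holders enter the sum, AH computed this way can be no more than an approximation of a Herfindahl index over all shareholders.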

In our regression equations we measure the percentage of shares 
owned by the top five and top 20 shareholders by applying a logistic 
transformation to these percentages, using the formula 


$$\log\left(\frac{\text{percentage concentration}}{100 - \text{percentage concentration}}\right).$$


The transformation is made to convert an otherwise bounded
dependent variable into an unbounded one. A logarithmic transformation
is applied to the Herfindahl measure of ownership concentration.⁵
We designate the transformed variable by prefixing L, as in LA5,
LA20, and LAH.
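
A short sketch of the two transformations follows. It assumes natural logarithms (the text says only “log”), and the function names are illustrative rather than taken from the paper.

```python
import math


def logistic_transform(pct):
    """Map a percentage strictly between 0 and 100 onto the real line."""
    if not 0.0 < pct < 100.0:
        raise ValueError("transformation is undefined at 0 and 100")
    return math.log(pct / (100.0 - pct))


def log_transform(value):
    """Logarithmic transformation, as applied to the Herfindahl measure."""
    return math.log(value)


# Illustrative values: 50 percent maps to zero, and the transform is
# antisymmetric around that point.
la5_mid = logistic_transform(50.0)
la5_low = logistic_transform(25.0)
la5_high = logistic_transform(75.0)
print(la5_mid, la5_low, la5_high)
```

The guard in `logistic_transform` reflects the fact that the transformed variable is undefined when a concentration measure equals zero or 100 percent.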

A glance at a simple correlation matrix for A5, A20, and AH indi¬ 
cates that we can expect similar empirical results from using these 
alternative measures. The correlation between A20 and AH is weak¬ 
est, but it is still .71. For purposes of constructing an index of own¬ 
ership concentration, the 20 largest ownership interests establish a 
workable outer limit. Beyond 20, it is difficult to interpret the mea¬ 
sure as a meaningful index of ownership concentration. 


5 Our empirical results remain significant when the equations are estimated using
nontransformed ownership variables.
Correlation of Ownership Measures for 511
Regulated and Nonregulated Firms

          A5      A20

A20      .92
AH       .86      .71


Our measure of firm size (EQUITY) is the average annual market
value of the firm’s common equity during the period 1976-80, with
units in thousands of dollars. We have experimented with other size
measures (e.g., book value of assets), but the general nature of the
statistical result is unaffected by this choice. Since our ownership data
pertain to the ownership of common equity, we prefer to proxy size
with a measure of the value of common equity. Our measures of
instability of a firm's environment (SE and STD_s) are based on stock
market rates of return as determined by 60 monthly stock market
returns during the 5-year period 1976-80.⁶ Instability measured by
the standard deviation in accounting profit rates (STD_a) is based on
five annual profit rates over the period 1976-80. Dummy variables
take a value of one if the firm is a regulated utility (UTIL), regulated
financial institution (FIN), or media firm (MEDIA), and zero
otherwise.
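
The instability measures can be sketched as follows, assuming SE is the standard error of estimate from a one-factor market model fitted by least squares and that the standard deviations use the sample (n - 1) formula; neither detail is confirmed by the text. The return series below are simulated and every name is hypothetical.

```python
import math
import random
import statistics


def market_model_se(firm_returns, market_returns):
    """Standard error of estimate from regressing firm on market returns."""
    n = len(firm_returns)
    mean_x = statistics.fmean(market_returns)
    mean_y = statistics.fmean(firm_returns)
    sxx = sum((x - mean_x) ** 2 for x in market_returns)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(market_returns, firm_returns))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((y - alpha - beta * x) ** 2
              for x, y in zip(market_returns, firm_returns))
    return math.sqrt(rss / (n - 2))   # two estimated parameters


random.seed(0)
# 60 simulated monthly returns stand in for the 1976-80 CRSP data.
market = [random.gauss(0.01, 0.04) for _ in range(60)]
firm = [0.002 + 1.1 * m + random.gauss(0.0, 0.05) for m in market]

se = market_model_se(firm, market)     # firm-specific instability (SE)
std_s = statistics.stdev(firm)         # total stock return instability
# Five made-up annual accounting profit rates.
annual_profit_rates = [0.12, 0.15, 0.09, 0.14, 0.11]
std_a = statistics.stdev(annual_profit_rates)
print(se, std_s, std_a)
```

The sketch also makes the sample-size contrast concrete: the stock-based measures rest on 60 observations per firm, the accounting measure on five.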

The second part of our empirical work tests the Berle-Means thesis,
which implies that diffuse ownership structures adversely affect
corporate performance. We test this by assessing the impact of
ownership structure on accounting profit rate (RETURN_a). In doing so, it
is necessary to control for other factors that may affect accounting
profit rate. These other factors include the size of the firm as
measured by the book value of assets averaged over 1976-80 (ASSET)
and a set of variables that seek to standardize for accounting artifacts.
These variables are ratios to sales of capital expenditures (CAP),
advertising (ADV), and R & D expenses (RD), all measured as averages
from the 1976-80 time period.

Table 2 gives summary definitions of all variables used in this pa¬ 
per. Summary statistics for these variables for the 511 firms in our 
sample are shown in table 3. 


Statistical Analysis of Ownership Concentration 

Ordinary least squares (OLS) regression estimates of LA5 on three 
alternative measures of profit instability and four other variables are 

6 We calculated SE by regressing the firm's monthly returns on the returns to a
value-weighted market portfolio.
TABLE 2

Description of Variables

A5         Percentage of shares controlled by top five shareholders;
           sources: CDE Stock Ownership Directories: Banking and
           Finance (1980), Energy (1980), and Fortune 500 (1981)

A20        Percentage of shares controlled by top 20 shareholders;
           sources: same as A5

AH         Herfindahl index of ownership concentration, calculated by
           summing the squared percentage of shares controlled by each
           shareholder; sources: same as A5

F5         Percentage of shares controlled by top five families and
           individuals; sources: same as A5

I5         Percentage of shares controlled by institutional investors;
           sources: same as A5

UTIL       One if firm is electric utility, natural gas pipeline, or
           natural gas distributor, zero otherwise; source: COMPUSTAT

FIN        One if firm is bank, savings and loan institution, insurance
           company, or securities firm, zero otherwise; source:
           COMPUSTAT

MEDIA      One if firm is newspaper publisher, book publisher, magazine
           publisher, or broadcaster, zero otherwise; source: COMPUSTAT

EQUITY     Market value of common equity in thousands of dollars
           (annual average, 1976-80); source: CRSP

RETURN_s   Stock market rate of return (average monthly return,
           1976-80); source: CRSP

RETURN_a   Accounting rate of return (annual average of net income to
           book value of shareholders' equity, 1976-80); source:
           COMPUSTAT

SE         Standard error of estimate from market model in which firm's
           average monthly return (1976-80) is regressed on the average
           monthly return on value-weighted market portfolio (1976-80);
           source: CRSP

STD_s      Standard deviation of monthly stock market rates of return,
           1976-80; source: CRSP

STD_a      Standard deviation of annual accounting rates of return,
           1976-80; source: COMPUSTAT

CAP        Ratio of capital expenditures (annual average, 1976-80) to
           total sales; source: COMPUSTAT

ADV        Ratio of advertising expenditures (annual average, 1976-80)
           to total sales; source: COMPUSTAT

RD         Ratio of research and development expenditures (annual
           average, 1976-80) to total sales; source: COMPUSTAT

ASSET      Value of total assets in millions of dollars (annual
           average, 1976-80); source: COMPUSTAT


shown in table 4. All three measures of instability are significantly and
positively related to ownership concentration. In addition to linearly
estimating ownership concentration as a function of instability, we
also estimated this relationship in nonlinear form by including the
squared value of the instability measure. The squared values of these
variables are negatively related to ownership concentration,
indicating that at higher values of these variables the increase in
concentration of ownership associated with given increases in instability
diminishes. Of the three instability measures, the standard error of



TABLE 3

Summary Statistics of Variables for 511 Firms in Sample

                             Standard
Variable        Mean         Deviation      Minimum      Maximum

A5              24.81        15.77          1.27         87.14
A20             37.66        16.73          1.27         91.54
AH              402.75       722.00         .60          4,052.38
F5              9.08         13.03          0            60.30
I5              18.30        11.52          .75          87.14
UTIL            .10          .30            0            1
FIN             .11          .31            0            1
MEDIA           .03          .10            0            1
EQUITY          $1,221,754   $2,008,140     $22,341      $40,587,203
RETURN_s        .017         .012           -.013        .074
RETURN_a        .238         .105           -.077        .824
SE              .007         .025           .031         .308
STD_s           .084         .020           .034         .412
STD_a           .055         .050           .002         .320
CAP             .080         .103           0            .841
ADV             .011         .023           0            .200
RD              .012         .020           0            .200
ASSET           $3,505       $8,114         $48          $94,162


TABLE 4

OLS Estimates of LA5

Intercept    -1.53    -2.10    -1.53    -2.02    -1.20    -1.29
             (13.6)   (11.0)   (12.3)   (10.1)   (20.3)   (15.8)
UTIL         -1.31    -1.20    -1.27    -1.15    -1.36    -1.33
             (11.1)   (10.0)   (10.4)   (9.9)    (11.6)   (11.3)
FIN          -.47     -.47     -.45     -.44     -.45     -.45
             (4.2)    (4.3)    (4.1)    (3.0)    (4.0)    (4.0)
MEDIA        .67      .70      .67      .68      .63      .62
             (3.2)    (3.4)    (3.2)    (3.3)    (3.0)    (3.0)
EQUITY*      -4.50    -3.51    -4.64    -3.00    -5.94    -5.70
             (3.5)    (2.7)    (3.6)    (3.1)    (4.6)    (4.5)
SE           6.86     17.04
             (4.8)    (5.0)
SE²                   -30.38
                      (4.1)
STD_s                          5.44     13.77
                               (4.2)    (4.7)
STD_s²                                  -28.50
                                        (3.1)
STD_a                                            2.84     5.49
                                                 (4.1)    (2.9)
STD_a²                                                    -11.78
                                                          (1.5)
N            511      511      511      511      511      511
R²           .31      .33      .30      .32      .30      .30
F            45.0     41.5     43.6     38.6     43.3     36.5

Note.—t-statistics are in parentheses.

* All coefficient estimates on EQUITY should be multiplied by 10⁻⁸.
estimate from the market model enters most significantly, and the
standard deviation in accounting profit rates enters least
significantly.⁷
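
The diminishing effect implied by a negative squared term can be made concrete with a small sketch. It takes the SE and SE² coefficients as read from table 4 (17.04 and -30.38) and evaluates the marginal effect of SE on LA5, holding the other regressors fixed. The function names are hypothetical, and the calculation is an illustration rather than a result reported in the paper.

```python
def marginal_effect(b_linear, b_squared, x):
    """Derivative of b_linear*x + b_squared*x**2 with respect to x."""
    return b_linear + 2.0 * b_squared * x


def turning_point(b_linear, b_squared):
    """Value of x at which the marginal effect reaches zero."""
    return -b_linear / (2.0 * b_squared)


B_SE, B_SE2 = 17.04, -30.38            # coefficients as read from table 4

effect_low = marginal_effect(B_SE, B_SE2, 0.05)
effect_high = marginal_effect(B_SE, B_SE2, 0.20)
peak = turning_point(B_SE, B_SE2)
print(effect_low, effect_high, peak)
```

Under these coefficients the marginal effect falls as SE rises but stays positive until SE reaches roughly 0.28, so diminishing rather than reversing is the relevant description over most of that range.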

All other variables take the expected signs, and all of the estimated
coefficients are statistically significant at the .95 level. Size of firm, as
measured by the market value of equity, is negatively related to
ownership concentration.⁸ The dummy for systematic regulation indicates
that the average concentration of ownership for the regulated firms is
significantly less than for other firms. The ownership structure of
utility firms is affected more by regulation than is that of financial
firms. Media firms exhibit significantly more ownership
concentration, on average, than other firms, a finding that is consistent with the
notion that tighter control is required to achieve the amenity potential
offered by the unique output of these firms.

The variation in LA5 explained by these equations is at least 30 
percent. When firm-specific risk is the instability measure, 33 percent 
of the variation is explained. The coefficients of all other variables 


7 Two additional specifications of the ownership equation deserve comment. As an
alternative proxy for control potential, we included the intraindustry variability (using
four-digit SIC codes) in average accounting profit rates (1976-80) as an independent
variable. Plausibly, greater differences in profit rates among firms in the same industry
provide an index of the difference in performance that can be wrought by superior
control decisions. However, no significant relationship exists between this new index of
control potential and ownership concentration. When it is entered as the sole control
potential variable, the intraindustry variability of profit rate enters with a positive but
statistically insignificant coefficient, and it does not significantly affect the other
regression coefficients. When this variable is added to the regression equations in which SE
proxies for control potential, it enters with a positive and statistically insignificant
coefficient, and it again leaves all other coefficient estimates essentially unaffected. The
simple correlation of the intraindustry variability of profit rate with SE, STD_s, and
STD_a never exceeds .10. High values of the intraindustry variability of profit rate may
correlate with poor census definitions of industries, or they may reflect accounting
artifacts that increase the divergence between profit rates within industries, but there is
no positive evidence of a linkage to control potential. This absence receives
confirmation from a statistical study that regresses ownership concentration on equity,
SE, SE², and 41 dummy variables, one for each two-digit industry containing our
sample firms. The coefficients of only four industries exhibited statistical significance,
and these were either mass media or regulated industries. Industry characteristics
other than these bear no relationship to ownership concentration. This absence of
significance is puzzling to us, but its implication may be important to industrial
organization studies. What the data seem to be saying is that firms are significantly different,
even within traditional industry classifications, and that many individual firms may
constitute quasi industries in and of themselves in regard to ownership concentration.

8 We also estimated a regression equation in which we entered the logarithm of
EQUITY as an independent variable. This variable entered with a negative and statisti¬ 
cally significant coefficient, and its inclusion did not significantly affect the other 
coefficients. Similarly, we included the squared value of EQUITY in addition to 
EQUITY and the other independent variables. EQUITY continued to enter with a 
significant, negative coefficient, and its squared value entered with a positive but 
insignificant coefficient. The other coefficient estimates remained unaffected in this 
equation. 
and their significance are largely unaffected by the measure of
instability chosen. Most altered is the coefficient on the market value of
equity, which varies from -3.51(E-08) for the nonlinear firm-specific
risk equation to -5.94(E-08) for the linear equation that includes the
standard deviation of accounting profit rate.

Different measures of ownership concentration are regressed on 
identical sets of explanatory variables for two samples of firms in table 
5. The left side of the table continues our investigation of the full 
sample of regulated and nonregulated firms. The right side of the 
table uses a smaller sample that systematically excludes regulated 
firms. Logistically transformed values of the percentage of shares 
owned by the five and by the 20 largest stockholding interests and the 
Herfindahl index are used as alternative measures of ownership
concentration. We note the large impact of regulation on R².

In table 6 we measure ownership concentration separately for all
investors (A5), family and individual investors (F5), and institutional
investors (I5). The percentage of shares owned by the five largest
shareholding interests (not logistically transformed) of each
shareholder class is the dependent variable in these regressions.⁹ We
examined these classifications of owners to discover whether the
significance of the coefficient on the media variable is attributable to the
behavior of family and individual owners or to institutional owners.
Since the assumption of amenity potential is strongly governed by
personal tastes, we do not expect ownership concentration to be
significantly higher for institutional owners if the firm is a media firm.

Table 6 reveals that the greater ownership concentration in media
firms is attributed almost exclusively to greater family and individual
holdings. The coefficient estimate on MEDIA is the identical value,
13.30, and it is statistically significant in the equations in which A5
and F5 are the dependent variables. When I5 is the dependent
variable, the coefficient estimate on MEDIA drops to 1.40, and it is not
statistically significant. These results are consistent with the
interpretation we have given to the amenity potential associated with control
of media firms.¹⁰


9 The variables F5 and I5 occasionally take a value of zero, at which point the logistic
transformation is undefined. For purposes of estimating these equations, we do not
transform the ownership variables.

10 “Softer” evidence reinforces the amenity explanation of ownership concentration
in the media industry. In 1984, Dow Jones & Company, 56 percent owned by the
Bancroft family, attempted to issue a stock dividend in the form of a new class of stock
that would have 10 votes per share compared with the one vote per share of the firm's
original common equity. The Dow Jones chairman described the rationale behind this
decision: “The purpose ... is to try to assure the long term future operation of The Wall
Street Journal and Dow Jones’ other publications and services under the same quasi
public trust philosophy that Clarence Barron and his descendants have followed during
TABLE 6

Ownership Concentration by Type of Owner

               Dependent      Dependent      Dependent
               Variable       Variable       Variable
               A5             F5             I5

Intercept      9.98           1.74           10.66
               (3.0)          (.6)           (4.3)
UTIL           -14.21         -6.86          -9.64
               (6.4)          (3.5)          (5.7)
FIN            -7.32          -3.26          -5.48
               (3.6)          (1.8)          (3.5)
MEDIA          13.30          13.30          1.40
               (3.5)          (3.9)          (.5)
EQUITY*        -5.00          -3.64          -2.88
               (2.1)          (1.7)          (.16)
SE             306.47         154.63         K»f>,85
               (5.4)          (3.1)          (3.9)
SE²            -607 I t       -388.16        -315.53
               (3.8)          (2.5)          (2.3)
N              511            511            511
R²             .23            .10            .16
F              24.7           9.7            15.5

Note.—t-statistics are in parentheses.

* All coefficient estimates on EQUITY should be multiplied by 10⁻⁷.


Additional evidence that suggests the amenity potential
explanation of ownership structure is found by examining ownership data for
professional sports clubs. Although we lack systematic ownership,
profit, and size data for individual clubs, we show in table 7 aggregate
ownership data for 121 clubs in five major sports. These clubs are
much more tightly controlled than the 511 firms in our sample.
Among the 121 sports clubs, there are 238 owners, an average of 1.97
per club, who either are general partners or control at least 10
percent of the club’s stock. Among the 511 firms in our sample, the
corresponding numbers are 218 owners and an average of 0.43
owners per firm. Admittedly, sports clubs are smaller than the 511 firms
in our sample, which in part explains the increased ownership con-


the company's history. The Bancroft family always has zealously guarded the integrity
and independence of the Journal and Dow Jones’ other publications. This has been
crucial to their growth and financial success. The family . . . also has encouraged
management always to take a long term view, investing heavily for the purpose of
building future strength and investment values. The family and the board, acting
unanimously and with management’s enthusiastic support, are seeking to protect and
build Dow Jones’ publications in the same manner in the years ahead through
continued family control” (“Dow Jones Votes” 1984, p. 5). Similarly, DeAngelo and
DeAngelo (1983), in a study of 45 firms that have dual classes of common stock, found
that both the New York Times and the Washington Post have dual classes of common stock
that trade with different voting rights.
TABLE 7

Ownership Data on 121 Sports Firms and 511 Nonsports Firms

                                             Number of         Number of
                                             Shareholders      Shareholders
                                             Owning 10         Owning 10
                                             Percent or        Percent or
                                             More of the       More of the
                                  Number     Firm's            Firm's Shares
Sample                            of Firms   Shares            per Firm

Sports clubs:
  Major league baseball           26         54                2.1
  North American Soccer League    24         54                2.3
  National Basketball Association 22         52                2.4
  National Football League        28         38                1.4
  National Hockey League          21         40                1.9
  All sports clubs                121        238               1.97
Demsetz-Lehn sample               511        218               .43

Source.—For sports clubs, North American Soccer League (NASL) v. NFL, no. 78
Civ. 4560-CSH, U.S. District Court, S.D. New York, February 21, 1979.


centration in the sports industry, and they may operate in less stable
environments, although we do not know this to be a fact. Nonetheless,
these data are consistent with the amenity explanation of ownership
structure.
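
The per-firm averages in table 7 follow directly from the club and owner counts, and a brief sketch makes the arithmetic easy to recheck. The counts are those given in table 7; the variable names are illustrative.

```python
# (number of clubs, number of owners holding 10 percent or more), per table 7
leagues = {
    "Major league baseball": (26, 54),
    "North American Soccer League": (24, 54),
    "National Basketball Association": (22, 52),
    "National Football League": (28, 38),
    "National Hockey League": (21, 40),
}

total_clubs = sum(clubs for clubs, _ in leagues.values())
total_owners = sum(owners for _, owners in leagues.values())
owners_per_club = total_owners / total_clubs

# Corresponding figures for the 511-firm sample.
sample_firms, sample_owners = 511, 218
owners_per_firm = sample_owners / sample_firms

print(total_clubs, total_owners,
      round(owners_per_club, 2), round(owners_per_firm, 2))
```

On these counts a sports club has about 4.6 times as many 10 percent owners as a firm in the 511-firm sample.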

The impact of regulation on ownership concentration is examined
from another perspective in table 8. Salomon Brothers rates the
regulatory climates in which electric utility firms operate, assigning letter
grades based on such factors as the allowed rate of return, the rate
base test period used, the cost items allowed in the rate base, and the
time taken by commissions to decide rate appeals. The 1979 rating we
use is an average of regulatory jurisdictions, calculated by using
revenue weighting for utilities operating in more than one jurisdiction.
We divide the electric utilities in our sample into two groups: those
that operate in regulatory climates that are less “stringent” than the
median (i.e., “more favorable” for investment purposes) and all other
electric utilities. The dummy variable, REGULATORY CLIMATE,
takes the value of one if the utility is in the former group and zero
otherwise.
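
A minimal sketch of how such a dummy could be built is given below. The numeric scale attached to the letter grades, the utility names, and the revenue figures are all hypothetical; the text does not say how the grades were converted to numbers, only that ratings were revenue weighted across jurisdictions and split at the median.

```python
import statistics

# Hypothetical grade scale: a higher score means a less stringent climate.
GRADE_POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}


def weighted_rating(jurisdictions):
    """Revenue-weighted average rating over (grade, revenue) pairs."""
    total_revenue = sum(revenue for _, revenue in jurisdictions)
    return sum(GRADE_POINTS[grade] * revenue
               for grade, revenue in jurisdictions) / total_revenue


# Made-up utilities, each with one or two regulatory jurisdictions.
utilities = {
    "Utility 1": [("A", 100.0)],
    "Utility 2": [("B", 60.0), ("D", 40.0)],
    "Utility 3": [("C", 100.0)],
    "Utility 4": [("E", 50.0), ("C", 50.0)],
}

ratings = {name: weighted_rating(j) for name, j in utilities.items()}
median_rating = statistics.median(ratings.values())

# One if the climate is less stringent than the median, zero otherwise.
regulatory_climate = {name: int(rating > median_rating)
                      for name, rating in ratings.items()}
print(ratings, median_rating, regulatory_climate)
```

Utilities rated exactly at the median fall in the zero group here, which is one possible reading of “all other electric utilities.”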

We expect that this index of regulatory climate is positively related
to ownership concentration, less stringent regulation offering owners
more control potential through fewer restrictions and less
commission monitoring of management. In all three equations where the
instability measure enters linearly and in the equation where STD_a
enters nonlinearly, REGULATORY CLIMATE enters with a
significant and positive estimated coefficient. When SE and STD_s enter
in nonlinear form, the estimated coefficient on REGULATORY
CLIMATE remains positive but is not significant. The firm size variable,
contrary to expectations, is not significantly related to ownership
concentration. All three of the instability measures are significantly
related to ownership concentration in the anticipated direction.

The Separation Issue 

The discussion to this point has focused on the determinants of
ownership structure. We now empirically examine the alleged consequence
of diffuse ownership structures for the separation of ownership and
control. Berle and Means brought the issue to center stage in 1933
with the publication of The Modern Corporation and Private Property.
Their interpretation of the issue has remained the focus of debate for
more than half a century. Diffuseness in ownership structure, by
modifying the link between ownership and control, is seen by them as
undermining the role of profit maximization as a guide to resource
allocation. Diffuseness of ownership is said to render owners of shares
powerless to constrain professional management. Since the interests
of management need not, and in general do not, naturally coincide
perfectly with those of owners, this would seem to imply that
corporate resources are not used entirely in the pursuit of shareholder
profit. Although Berle and Means make no great effort to describe
how corporate resources are allocated, later discussions of the
corporation dwell on management's consumption of amenities at the
expense of owner profits.

Berle and Means’s work was anticipated by Thorstein Veblen's 
(1924) volume, The Engineers and the Price System. Veblen believed that 
he was witnessing the transfer of control from capitalistic owners to 
engineer-managers and that the consequences of this transfer were to 
become more pronounced as diffusely owned corporations grew in 
economic importance. In the wake of this transfer of power, Veblen 
saw the end of the type of profit seeking he associated with capitalists, 
for he believed that capitalistic owners sought neither efficiency nor 
increased output so much as monopolistic restrictions to raise prices. 
The engineers, trained and acculturated to seek technological 
efficiency, would see to it that the production from the firms they now 
controlled would rise to higher and socially more desirable levels. The 
profits of monopoly would be sacrificed on the altar of efficiency. 

One of Veblen's famous disciples, John Kenneth Galbraith, shared
his teacher’s assessment of the change in control but evaluated the
outcome differently. In The New Industrial State (1967) he argued that
the technocrats who had gained control of the diffusely owned
modern corporation would sacrifice owner profit to increased output
beyond levels that served the real interests of consumers. Enticed to
purchase these large output rates by powerful advertising campaigns,
consumers would cause the private sector to grow too rapidly and at
the expense of the public sector.¹¹

Although the three views discussed above concerning the
consequences of diffuse ownership structures offer somewhat different
evaluations, they unanimously imply a positive correlation between
ownership concentration and profit rate. If diffuseness in control
allows managers to serve their needs rather than tend to the profits of
owners, then more concentrated ownership, by establishing a
stronger link between managerial behavior and owner interests, ought to
yield higher profit rates.

We expect no such relationship. A decision by shareholders to alter
the ownership structure of their firm from concentrated to diffuse
should be a decision made in awareness of its consequences for
loosening control over professional management. The higher cost
and reduced profit that would be associated with this loosening in
owner control should be offset by lower capital acquisition cost or
other profit-enhancing aspects of diffuse ownership if shareholders
choose to broaden ownership. Standardizing on other determinants
of profit, Demsetz (1983) has argued that ownership concentration
and profit rate should be unrelated.

Table 9 reports recursive estimates for coefficients of a profit rate
equation in which the key independent variables are alternative
predicted measures of ownership concentration: LA5, LA20, and
LAH.¹² The dependent variable is the mean value of annual
accounting profit after taxes, as a percentage of the book value of equity. The
mean is calculated for the 5-year period 1976-80. Stock market rates
of return presumably adjust for any divergences between the interests


11 The entire discussion of the separation thesis presumes that diffuseness of
ownership is a pervasive phenomenon. Our data cast doubt on this presumption. Our
sample is heavily weighted by Fortune 500 firms, precisely the firms that are supposed
to suffer from diffuse ownership structures. Yet the mean values of A5 and A20,
respectively, are 24.8 percent and 37.7 percent.

12 The predicted measures of LA5, LA20, and LAH were estimated from an OLS
equation that included the following independent variables: UTIL, FIN, MEDIA,
EQUITY, SE, and SE². The results reported in the table do not change significantly when
the ownership equations are estimated using alternative specifications that were
previously reported.
TABLE 9

Recursive Estimates of Mean Accounting Profit Rate

Intercept          .24          .27          .35
                  (6.2)       (11.7)        (4.2)
UTIL              -.13         -.10         -.13
                  (3.4)        (2.4)        (3.1)
FIN               -.07         -.06         -.07
                  (3.6)        (3.3)        (3.5)
CAP                .04          .05          .04
                   (.7)         (.8)         (.7)
ADV                .42          .47          .42
                  (1.9)        (2.3)        (1.9)
RD                -.11         -.07         -.11
                   (.4)         (.3)         (.4)
ASSET*            5.70         8.14         5.97
                   (.8)        (1.2)         (.9)
SE                -.29         -.43         -.29
                  (1.1)        (2.0)        (1.1)
LA5               -.02
                   (.9)
LA20                          -.004
                               (.2)
LAH                                         -.02
                                             (.9)
N                  511          511          511
R²                 .10          .10          .10
F                  7.2          7.2          7.1

NOTE: t-statistics are in parentheses.
* Coefficient estimates on ASSET are multiplied by 10^7.



of professional management and owners, so we rely on accounting 
rates of return to reveal such divergences. 

In addition to ownership concentration, we include several other 
independent variables in this equation. The utilities and financial 
dummies isolate the impact of systematic regulation. The coefficient 
on the financial dummy may be explained by accounting procedures, 
which for these firms include outstanding loans in the asset base. The
potential upward bias in asset measurement that results is likely to 
depress the measured accounting profit rate. Capital, advertising, and 
R & D expenditures, all as a percentage of sales, standardize for 
accounting artifacts associated with the decision to expense some of 
these investments but to depreciate others. The size of the firm is 
measured by the book value of assets. 
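The recursive structure of this estimation can be illustrated with a small numerical sketch. It is not the authors' code or data: the regressors, the coefficients used to generate them, and the helper `ols` are hypothetical stand-ins, and only the sample size (511) is taken from table 9. A first-stage OLS equation predicts ownership concentration, and the predicted value then enters the profit rate equation.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and conventional t-statistics for y = X b + e."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # classical covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 511                                       # sample size reported in table 9

# Hypothetical data: two first-stage regressors and one profit-equation control.
z = rng.normal(size=(n, 2))
ownership = 0.5 * z[:, 0] - 0.3 * z[:, 1] + rng.normal(size=n)
control = rng.normal(size=n)
# Profit is generated with NO ownership effect, mimicking the null hypothesis.
profit = 0.1 + 0.2 * control + rng.normal(scale=0.5, size=n)

# Stage 1: predict ownership concentration.
Z = np.column_stack([np.ones(n), z])
b1, _ = ols(Z, ownership)
ownership_hat = Z @ b1

# Stage 2: profit rate on the control and predicted ownership.
X = np.column_stack([np.ones(n), control, ownership_hat])
b2, t2 = ols(X, profit)
print("coefficient on predicted ownership:", b2[2], "t:", t2[2])
```

The second-stage t-statistics in this sketch are the uncorrected OLS values; they ignore the sampling error in the generated regressor, so they are indicative only.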

The general explanatory power of the profit rate equation is quite 
low, but regulation does seem to have a negative impact on account¬ 
ing profit rate. Table 9 shows no significant relationship between 
ownership concentration and accounting profit rate, and especially no 



1 1 76 JOURNAL OF POLITICAL ECONOMY 

significant positive relationship. 13 The data simply lend no support to
the Berle-Means thesis. 14

We have suggested above that certain industries may be character¬ 
ized as offering greater amenity potential and that this would lead to 
more concentrated ownership. This does not assert that the more 
concentrated is ownership, the greater the tendency to cater to amen¬ 
ity potential. If we were to make such an assertion it would imply a 
negative correlation between profit rate and ownership concentra¬ 
tion, and this would tend to hide the opposite correlation suggested 
by Berle and Means. But, then, this would constitute no evidence 
more favorable to the Berle-Means hypothesis. Catering to amenity 
potential is maximizing owner utility if not owner profit. Such maximi¬ 
zation hardly constitutes evidence of a separation between ownership 
and control. 


Concluding Comments 

We have argued, both conceptually and empirically, that the structure 
of corporate ownership varies systematically in ways that are consis¬ 
tent with value maximization. Understanding some of the forces that 
determine corporate ownership structure is valuable in its own right, 
but we also think that our results are germane to a more general 
theory of property rights. For example, can the land enclosure move¬ 
ment in England be explained in part by the enhanced control poten¬ 
tial of landownership during periods of population growth and rising 
prices of farm and ranch products? Similarly, does greater predict¬ 
ability of an industry’s environment make industry regulation politi¬ 
cally more tolerable because collectivization of control is likely to be 
less damaging in such cases? Our analysis suggests a framework for 
new studies that may shed some light on these broader questions. 


13 Not reported here is a replication of table 9 in which the profit rate equation is
estimated using the actual, not predicted, value of the ownership variable. No changes
in conclusions are called for by this replication. We also replicated table 9 on a set of
firms for which we were able to obtain the industry four-firm concentration ratio. The
concentration ratio enters the profit rate equation with a negative and statistically
significant sign, but the coefficient estimates on all three ownership variables remain
not significant. Earlier studies of the profit-concentration relationship show a weakening
of the usual positive correlation during periods of rising price levels. The negative
relationship revealed in our work may, therefore, reflect the inflationary tenor of the
late 1970s. The estimated ownership equation for this subset of firms performs weaker
than it does when estimated for the entire sample.

14 Our results are consistent with those of Stigler and Friedland (1983). They reject
the separation thesis by demonstrating that management salaries are no higher in
“management-controlled” than in “owner-controlled” industries.
Some Colonial Evidence on Two Theories of
Money: Maryland and the Carolinas 


Bruce D. Smith 

Federal Reserve Bank of Minneapolis and Carnegie-Mellon University


Recent developments in monetary economics stress the nature of
monetary injections, emphasizing that they have implications for the
relationship between money and prices. In contrast, traditional approaches
posit stable money demand functions that are independent
of how money is injected. The former approach implies that certain
proportionality relations between money and prices need not obtain.
This permits the two approaches to be empirically distinguished, but
only if an appropriate "experiment" is conducted. The colonial period
is one such experiment. Colonial evidence suggests that the
nature of injections is crucial to the effect on prices of changes in the
money supply.


One of the most profound recent developments in monetary theory 
has been a rethinking of how the value of money is determined. In 
particular, Sargent (1982) and Wallace (1981) have stressed the importance
of how money is introduced in determining its value. This
contrasts strongly with the view that the value of money is determined
by its quantity (or time path) in conjunction with a demand function


The views expressed herein are those of the author and not necessarily those of the
Federal Reserve Bank of Minneapolis or the Federal Reserve System. In writing this
paper I have had the benefit of extensive conversations with John McCusker, Russ
for money that is quite stable over time and reasonably invariant with
respect to the nature of monetary injections (contractions). The Sargent-Wallace
view suggests that one can expect to find in history large
monetary expansions (contractions) that were not accompanied by 
eroding (increasing) currency values if these expansions were pro¬ 
duced in an appropriate way. Thus one expects to find historical 
instances that permit the quantity theory of money to be contrasted 
with that of Sargent and Wallace. 

In fact, several such instances have been examined that provide 
indirect support for the Sargent-Wallace approach. Sargent (1982)
discusses how four hyperinflations were ended by a change in the 
nature of backing for currency despite continued high rates of 
growth in the money supply. Riley and McCusker (1983) document 
sustained per capita growth in the money supply of France (1650-1788)
while price levels fell. Also Smith (1985) presents evidence
from some of the British colonies in North America (1720-70) that 
both very rapid growth and contraction of money stocks occurred 
without resulting in price level changes or exchange rate movements. 
This is attributed to the nature of backing for money, and cross- 
sectional evidence from several of the colonies is produced to show (a) 
that better backing of currencies resulted in more stable currency 
values and (b) that when backing was relatively amply provided relative
rates of growth in the money stock across colonies were not
strongly related to relative rates of inflation or currency depreciation. 

This paper is an attempt both to expand the body of evidence 
against the quantity theory and, for the first time, to present some 
direct evidence on the Sargent-Wallace approach. In particular,
Smith (1985) examined primarily Massachusetts, Rhode Island, New
York, Pennsylvania, New Jersey, and Virginia. This paper examines
price levels and exchange rates in the Carolinas and shows that they 
are poorly accounted for by changes in the money supplies of those 
colonies. It then turns to an examination of the monetary system of 
Maryland, which is particularly well suited to provide direct evidence 
for or against the Sargent-Wallace approach to determining the value
of money. The experience of the colony turns out to be generally
supportive of the Sargent-Wallace view.

The reason for focusing on Maryland derives from the nearly 
unique method adopted by that colony for backing its currency. Each 
of the colonies (at least ostensibly) backed its currency in some man¬ 
ner. Typically, currencies were backed either with future tax receipts 
or with mortgages (usually on land or metal plate). A time path for 
the value of this backing is generally impossible to obtain from exist¬ 
ing data. However, Maryland backed the largest component of its 
note issues with the proceeds of a sinking fund invested in Bank of 
England stock. At preannounced dates (which were met in practice)
some portion of the outstanding stock of these notes was to be con¬ 
verted into sterling (or, more precisely, sterling bills of exchange, 
described below) at a specified rate. Thus a large component of 
Maryland paper money was a claim to future delivery of sterling. 
As the Sargent-Wallace view suggests that the value of money can 
be determined in essentially the same way as the value of privately 
issued claims, as Maryland notes were a claim against the sink¬ 
ing fund, and as there are fairly complete data on the market value of 
this sinking fund, a particularly appropriate setting is provided 
in which to gather some empirical evidence on the Sargent-Wallace 
view. 
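The idea that a Maryland note was a claim to future delivery of sterling can be made concrete with a small valuation sketch. Everything numerical here is hypothetical (the redemption amount, discount rate, horizon, and fund values are illustrative, not colonial data), and the helper names are not from the paper. The sketch treats a note as a dated claim and compares the sinking fund's market value with the claims against it.

```python
def present_value(sterling_due, annual_rate, years_to_redemption):
    """Value today of a claim paying `sterling_due` at a known future date."""
    return sterling_due / (1.0 + annual_rate) ** years_to_redemption

def backing_ratio(fund_market_value, sterling_claims):
    """Market value of the sinking fund per unit of sterling promised."""
    return fund_market_value / sterling_claims

# Hypothetical illustration: a note promising 100 units of sterling in 10 years,
# discounted at 5 percent a year.
note_value = present_value(100.0, 0.05, 10)

# Hypothetical fund positions: fully covered versus half covered.
fully_backed = backing_ratio(30000.0, 30000.0)
half_backed = backing_ratio(15000.0, 30000.0)

print(round(note_value, 2), fully_backed, half_backed)
```

On this reading, the note's value should respond to the fund's market value and to the credibility of the redemption schedule, not to the number of notes as such.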

The results of the paper are as follows. The quantity of money in 
circulation does not account well for the time path of prices or ex¬ 
change rates. In South Carolina, for instance, the per capita stock of 
paper money more than tripled from 1755 to 1760. The price level 
increased 7 percent over this same period, while exchange rates be¬ 
tween South Carolina currency and sterling remained constant. From 
1760 to 1770, on the other hand, the per capita paper money stock 
declined by 63 percent. The price level rose 1 percent, and exchange 
rates depreciated 2 percent. Similarly, from 1760 to 1768 the per 
capita stock of paper money in North Carolina was halved, while the 
exchange rate between North Carolina currency and sterling ap¬ 
preciated only 5 percent. (There are no existing price indices for 
North Carolina at this time.) It will be argued that these facts are 
irreconcilable with the quantity theory. 
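The size of the discrepancy can be seen with a short calculation. The sketch below assumes strict proportionality between per capita paper money and the price level (constant velocity and per capita output), which is a simplification adopted here for illustration; the money and price figures are the South Carolina numbers quoted above, with "more than tripled" entered as a factor of 3.

```python
def proportional_price_change(money_ratio):
    """Percent change in prices if they move one-for-one with per capita money."""
    return (money_ratio - 1.0) * 100.0

# (episode, ending/starting per capita paper money, observed % change in prices)
episodes = [
    ("South Carolina, 1755-60", 3.0, 7.0),         # tripling is a lower bound
    ("South Carolina, 1760-70", 1.0 - 0.63, 1.0),  # stock declined by 63 percent
]

for name, ratio, observed in episodes:
    predicted = proportional_price_change(ratio)
    print(f"{name}: predicted {predicted:+.0f}%, observed {observed:+.0f}%")
```

Under these assumptions the predicted and observed price movements differ by two orders of magnitude in the first episode and have opposite signs in the second.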

Evidence shows that the Sargent-Wallace approach finds much 
more support in the data. Regression results indicate that the quantity 
of money in circulation in Maryland had no effect on exchange rates. 
However, both the value of the sinking fund and variables relating to 
Maryland’s track record for redeeming notes on schedule affect the 
exchange rate strongly and with signs that corroborate the Sargent- 
Wallace viewpoint. Thus the overall picture arising from the evidence 
presented here is that the value of money in the colonial period ap¬ 
pears to have been determined in much the same way as is the value 
of privately issued liabilities. 

The format of the paper is as follows. Section I provides a brief 
description of the important features of colonial monetary arrange¬ 
ments. Sections II and III discuss the alternative theories of money 
that are under consideration here. Section IV discusses the experi¬ 
ences of the Carolinas with regard to currency values and shows that 
they are inconsistent with the quantity theory. Section V examines the 
relation between currency values and the value of backing for cur¬ 
rency in Maryland. Section VI presents conclusions. 





I. Colonial Monetary Arrangements 

The term “money” applied to the colonies has been used in different 
ways by different authors. At its broadest the term “money” includes 
specie, various kinds of paper money, monetized commodities (i.e., 
commodities that were legal tender as well as circulating warehouse 
receipts for commodities), bills of exchange (circulating, privately is¬ 
sued liabilities), and book credit extended by merchants. Of these, 
contemporary usage included in the term only specie, paper money, 
and commodity monies. In this section I provide an overview of these 
various assets as a prelude to examining how the two theories dis¬ 
cussed above fit the data. 

Local units of account in each colony were pounds colonial cur¬ 
rency. 1 There were flexible exchange rates between the currency of 
each colony and sterling. Colonial currency itself took two forms. One 
was specie. The specie circulating in North America at this time was 
coined primarily in Spanish and Portuguese colonies and was denominated
in the units of account of those colonies. The amount in
circulation was outside of colonial control, being determined by trade
flows and the specie holdings of immigrants.

The currency denominated in the local unit of account was paper 
currency, which was issued in amounts determined by the legislature 
of each colony (subject to approval of colonial governors, proprietors, 
and the crown). This paper took two forms: bills of credit issued by
colonial institutions known as loan offices or land banks and bills of 
credit issued directly by colonial treasuries. Notes issued by treasuries 
were used to cover shortfalls of tax receipts, and these notes were
introduced into the economy by direct payment for goods or services 
provided to the government. One exception to this statement is that 
Maryland injected some notes via lump-sum transfers. 

Notes issued by loan offices were introduced in a more complex 
way. In the colonies and period under consideration there were no 
private banks. Rather, most colonies operated land banks, which is¬ 
sued notes that were lent to private individuals and secured by mort¬ 
gages on land or on plate. The interest rates charged on these loans 
appear generally to have been below market rates. In addition, a 
number of rules governed operation of these loan offices, which were 
meant to provide secure backing for the notes. These included provi¬ 
sions that the amount lent by the loan office was not to exceed half the 
value of the property mortgaged. 2


1 Except in Maryland after where dollars became the unit of account.

2 This practice of issuing notes backed by land may appear reminiscent of the real-bills
doctrine. However, the quantity of notes issued was fixed exogenously by colonial
legislatures.
The notes issued in these two ways were (as has been pointed out)
the only types of currency actually denominated in the local unit of
account. For much of the period under consideration they were legal 
tender. Colonial governments were obligated to accept these notes at 
face value in payment of taxes and in repayment of loans issued by 
colonial land banks. In addition, in the colonies at hand they were 
issued specifically to provide a medium of exchange in the light of the 
shortcomings of commodity monies and the problems attendant on 
the use of specie as a medium of exchange. 3 

It should also be noted that although many authors refer to these 
notes as fiat money, all colonial note issues were (at least ostensibly)
backed in some manner. In the case of notes issued by loan offices, as 
the principal of a loan was repaid provisions were made for its retire¬ 
ment at specified dates. In the event of default, mortgaged property 
was to be seized and auctioned off and the proceeds used to retire 
notes. Notes issued by colonial treasuries were backed by future tax 
receipts. In particular, at the time such a note issue was authorized, 
future taxes were earmarked to be used to retire the notes. This 
system was meant to prevent the accumulation of any long-term gov¬ 
ernment debt, although as we will see, different governments backed 
their notes with greater or lesser degrees of scrupulousness.

In addition to specie and paper currency, each of the colonies ex¬ 
amined in this paper had a commodity money system. In Maryland 
tobacco was legal tender. Until 1747 people trading in tobacco used
the actual commodity in transactions. After 1747 Maryland in¬ 
troduced a system of colonial warehouses and the use of tobacco 
notes, which were simply negotiable warehouse receipts for tobacco 
stored. In North Carolina several commodities were legal tender, and 
the government of the colony established rates at which each was to be 
accepted in payments due the government. Unlike most other col¬ 
onies, which discontinued their commodity money systems when 
sufficient paper money had been issued, this arrangement persisted 
in North Carolina throughout much of the period in question. 

In addition to these types of money, some historians include book 
credit and bills of exchange as part of the colonial “transactions 
media.” Book credit was simply credit extended by merchants to cus¬ 
tomers, and bills of exchange were circulating, privately issued 


3 There were several such problems. One is that much of the specie circulating in
North America was badly worn and underweight. Hence it was necessary to compensate
for this in transactions. A second is that much specie circulated in relatively large
denominations. For the reason just mentioned, there were not constant returns to scale
in the division of specie. Hence the use of specie in standard transactions created
problems, as did tax payments in specie. This problem is frequently mentioned in the
literature. Two interesting papers on this topic are by Hanson (1979, 1980).
liabilities. In this sense they may appear similar to modern bank
liabilities. However, this similarity does not extend very far. Bills of 
exchange were not convertible into currency on demand, but rather 
carried a maturity date. Moreover, there were often many copies of a 
single bill in existence. If the holder of one of these copies presented 
it for (illegitimate) repayment, the legitimate holder of the bill would 
need to and often did have to contest payment in court (Gould 1915, 
p. 38). Finally, bills of exchange appear to have been used only in 
relatively large denominations. Hence they appear to have been much 
more like privately issued assets for which secondary markets exist 
than like bank deposits. 

Given this overview of colonial monetary arrangements, we may 
now turn to a description of the two alternative theories of money that 
will be used to try to explain the colonial experience. 


II. A Version of the Quantity Theory 

According to Lucas (1980, p. 1005), one of the “two central implica¬ 
tions of the quantity theory [is] that a given change in the rate of 
change in the quantity of money induces ... an equal change in the 
rate of price inflation.” According to Schwartz (1973, p. 264), at least 
since Alexander the Great, “long-run price changes consistently par¬ 
allel . . . monetary changes,” which is argued to be a verification of 
quantity-theoretic views. 
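The proportionality implication quoted from Lucas can be written compactly. The notation below is not in the original and is introduced only for illustration: $M$ is the money stock, $V$ velocity, $P$ the price level, $Y$ real output, and $\mu$, $\nu$, $\pi$, $g$ their respective growth rates.

```latex
% Equation of exchange (notation assumed here, not taken from the paper)
M V = P Y
\quad\Longrightarrow\quad
\pi = \mu + \nu - g .
% If velocity growth \nu and output growth g are unaffected by money growth,
% a change in the rate of money growth passes through one-for-one:
\Delta \pi = \Delta \mu .
```

The empirical question is then whether observed changes in inflation (or currency depreciation) match observed changes in money growth.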

How are we to check, then, whether these views are consistent with 
colonial monetary arrangements, in which each colony had its own 
paper currency exchanging with sterling at market-determined rates? 
The approach adopted here is one applied to Latin America by Vogel 
(1974), which is to match price level movements (or in some cases here 
exchange rate movements) with changes in the quantity of money 
issued by each of the colonies, that is, with changes in the stock of 
paper money outstanding in each colony. In fact, because data on 
specie, quantities of commodity monies, circulating bills of exchange, 
and so forth are not available, there is really no choice other than to 
attempt to do this. Moreover, this approach coincides quite well with 
quantity-theoretic implications when applied to New England before 
1750 (Smith 1985). However, because matching paper currency 
movements with price level (or exchange rate) movements omits 
many things that a quantity theorist might in principle wish to con¬ 
sider, I argue below that this approach does no great violence to the 
quantity theory. 

What is omitted by focusing on movements in the stock of paper 
currency? First, as indicated in the previous section, liabilities of private
agents such as book credit or bills of exchange are not considered.
However, as argued above, bills of exchange appear to have had
many of the attributes of modern privately issued liabilities for which 
secondary markets exist. Such liabilities are not included in modern 
money supply measures. Similarly, book credit was simply credit ex¬ 
tended by merchants to customers. Such credit extensions are also not 
included in modern attempts to implement the quantity theory em¬ 
pirically. Hence omission of these items would not appear to do any 
violence to the quantity theory. 

With respect to commodity monies, two facts should be noted. One 
is that in the Carolinas, and in Maryland before 1747, commodity 
notes were not in use. Thus exchanges with commodity monies were 
simply trades of commodities. The government of North Carolina, 
for instance, fixed legal rates at which selected commodities would be 
accepted in lieu of specie. Since the legal rate established on com¬ 
modities generally could not have corresponded to market-clearing 
prices, it seems unlikely that such rates obtained in private transac¬ 
tions. More probably they obtained only in transactions with the gov¬ 
ernment when it was advantageous to make payments with certain 
commodities. Thus it seems an open question whether these are to be 
regarded as monetary transactions. 

Second, even once the system of commodity notes was introduced, 4
commodity monies did not enjoy the same general acceptability or
circulate so widely as paper currency. According to McCusker (1976,
p. 97), “the major characteristic distinguishing colonial bills of credit
from commodity notes was their widespread acceptability.” In fact, it
appears that in Virginia (which has been more extensively studied 
than the other colonies) tobacco notes virtually did not circulate at all, 
except to transfer title to tobacco. These statements apply even more 
strongly to commodity money systems without commodity notes. 
Hence omission of commodity monies does not seem a particularly 
important problem. Finally, McCusker (1976, p. 95) likens commod¬ 
ity notes to “modern warehouse certificates [which] have a negotiable 
character.” These are not included in modern attempts to implement 
the quantity theory. Hence their omission does not seem out of line 
with standard practice. 

Last, the approach taken here omits the quantity of specie in circu¬ 
lation from the measured money supply. While it is unfortunate to be 
forced to omit this, I will still argue that its omission does not bias the 
results in any important way. First, as has been noted previously, 
specie circulating in the colonies was primarily of Spanish and Portu¬ 
guese origin and was not denominated in the unit of account of any 


4 Commodity notes were circulating warehouse receipts for the commodity in question,
as will be recalled from Sec. I.
colony. Second, money issued by foreign governments circulating
within the borders of another country is not included in modern 
attempts to implement the quantity theory. Hence this omission is not 
out of line with standard practice. Third, in the colonies under con¬ 
sideration here, specie omission is not particularly detrimental. In 
North Carolina “it appears certain that there was never any substan¬ 
tial amount of coin in the colony throughout the period” (Brock 1975, 
pp. 107-8). In Maryland, specie circulated at a market-determined 
exchange rate with notes within the colony. Hence it would be inap¬ 
propriate to look at a sum of notes and specie. And finally, even in 
South Carolina, “a paper bill of credit, with a distinct, explicit value in 
colonial currency, was naturally to be preferred over any given coin, 
the value of which in colonial currency was uncertain or, at least, 
debatable. Not only did a gold or silver coin bear no indication of its 
value in colonial currency, but its value depended on its weight and 
condition, factors not easily measured by individual colonists” 
(McCusker 1976, p. 97). Thus, even in South Carolina, it is not un¬ 
reasonable to proceed as if there were flexible exchange rates between 
specie and paper currency. 

Lest one be unpersuaded by these arguments, however, I should
note the following. In asking whether the quantity theory can con¬ 
front colonial monetary phenomena, the approach will be to match 
paper currency movements with movements in prices and exchange 
rates. It will be seen that for these colonies, as for most of the colonies 
examined in Smith (1985), these movements match very poorly (even 
over long periods). It might be suspected that this is due to one of two 
factors: either (1) paper currency was not a large component of the 
“money supply” (appropriately defined), or (2) changes in the stock of
paper currency were offset by specie flows.

In fact, neither of these views is tenable. With regard to the first 
point, conservative contemporary estimates placed the components of 
the money supply (which according to contemporary usage meant 
specie and paper currency) late in the colonial period at about '/1 
specie and % paper currency. 5 As we have seen, this seems conservative
for at least some of the colonies at hand. Thus paper currency
circulation was not so small that even large increases (or reductions) in 
it did not have significant impacts on the money supply. 

With regard to the second point, this view also does not bear close 
examination. First, there is no evidence in favor of it. Second, during 
much of the period at hand there are reasons to think either that the 


5 See McCusker (1978, p. 7, n. 9). More strongly, Adam Smith (1776/1937, p. 307)
asserts that “almost all the ordinary transactions of its [North America's] interior
commerce [are] being thus carried on by paper.”
reverse happened or at least that specie flows large enough to offset
paper currency movements could not have occurred. In North 
Carolina, for instance, it has been noted that there was never any 
significant amount of specie in the colony. In Maryland this view is 
also not tenable. During the period we examine there were two in¬ 
stances of large increases in the quantity of paper currency: one as 
this currency was injected into the economy over a period of years 
and one during the French and Indian War. With respect to the first 
period, Gould (1915) asserts (without apparent contradiction else¬ 
where in the literature) that specie stocks rose along with the stock of 
paper currency. Hence it would appear that movements in the stock 
of paper currency do not give an overly inaccurate picture of move¬ 
ments in the overall stock of money. During the French and Indian 
War period (this is true of all of the colonies considered) there is also 
every reason to think that movements in the stock of specie and of 
paper currency were generally positively rather than negatively cor¬ 
related. The reason for this is as follows: During the war each of these 
colonies made large military expenditures. They were generally 
financed by the printing of money. Hence note issues rose dramat¬ 
ically (as will be seen) during this period. At the same time British 
expenditures in the colonies were large, and, in addition, the British 
government provided sterling grants to each of the colonies. Both of 
these must have had the effect of increasing specie stocks. Thus paper 
currency and specie stocks both grew during the war. 

After the war paper currency stocks contracted very rapidly. The 
reason for this is that notes were backed by future tax receipts. At the 
time of note issue, future taxes were levied. As notes came in in
receipt of these taxes they were destroyed. The resultant contraction 
in paper currency stocks was most likely accompanied by a contrac¬ 
tion in specie circulation. The reason is that, as is well known, at the 
end of the war there was strong sentiment in England that the col¬ 
onies should help pay for the war. The taxes that were imposed 
almost certainly led to drains of specie at the same time as paper 
currency was being retired. Hence in this instance as well it is prob¬ 
able that movements in the stock of paper money were paralleled by 
similar specie movements. Thus again our approach should provide a 
reasonably accurate picture of movements in the overall stock of 
money. 6

6 One additional comment might be made on the question of paper currency issues
displacing specie. Adam Smith (1776/1937) suggested that paper currency issues might 
have exactly this effect, with new issues of currency displacing equal amounts of specie 
and having no price level effects. The absence of price level effects is roughly what is 
observed for parts of the samples. Monetarists should probably not find such an out¬ 
come encouraging, if they believe that this is what actually occurred. On this point one 
might consult Mints (1945, p. 30), e.g., who states that on this issue Smith and others 
“were completely wrong in their conclusions.”
III. The Sargent-Wallace View

In contrast to the quantity theory, the Sargent-Wallace approach is to 
attempt to determine the goods value of money (inverse price level) in 
much the same way that the value of privately issued liabilities is 
determined. In particular, just as the value of privately issued 
liabilities depends on the issuer’s balance sheet, the same is true for 
government liabilities. Thus issues of money that are accompanied by 
increases in the (expected) discounted present value of the govern¬ 
ment’s revenues need not be inflationary. 
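One conventional way to write this balance-sheet view is sketched below. The notation is an assumption made here for illustration and does not appear in the original: $B_t$ is the nominal stock of government liabilities (including money), $P_t$ the price level, $s_{t+j}$ the real government surplus in period $t+j$, and $r$ a constant real discount rate.

```latex
% Real value of government liabilities equals the expected present value
% of current and future surpluses (illustrative notation, constant r assumed)
\frac{B_t}{P_t} \;\approx\; \sum_{j=0}^{\infty} \frac{E_t\, s_{t+j}}{(1+r)^{j}}
```

If an increase in $B_t$ is matched by a proportionate increase in the right-hand side, $P_t$ need not change; if the right-hand side is unchanged, $P_t$ must rise in proportion to $B_t$.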

As this approach likens money to privately issued liabilities, it seems 
appropriate to attempt first to apply it to monetary systems that are 
not fiat in nature, that is, in which money is backed. However, if paper 
money is convertible on demand into commodities, then one is per¬ 
haps not surprised that its value is not directly linked to its quantity. 
Thus it seems that the colonial monetary arrangements under consid¬ 
eration, where money was (supposed to be) backed by future income 
streams but was not convertible on demand into any commodity, are 
particularly appropriate for study of this view. 

What should we expect to observe in the colonial period under the 
alternate theory, then? We should expect to observe that when money 
is carefully backed, its value (price levels, exchange rates) should not 
depend strongly on its quantity. When money is not carefully backed, 
it should depreciate in value. In fact, when incremental note issues 
that are essentially unbacked occur, the quantity theory becomes a 
special case of the Sargent-Wallace approach. 

In order to see this, it is useful to consider an analogy. Suppose that
a firm doubles the number of its shares outstanding. What happens to
its price per share? The answer is that more information is required.
In a stock split we expect a halving in the price of the stock. This is
analogous to the quantity theory and corresponds to the case where a
firm increases its liabilities without a corresponding increase in its
future (expected) stream of net revenues. On the other hand, if the
quantity of a firm’s shares outstanding increases and there is a
corresponding increase in its income stream, the change in stock price
depends on the relative magnitudes of these increases. Thus whether
or not quantity-theoretic propositions apply depends on the nature of
backing for government liabilities. If these are poorly backed or
unbacked, we expect these propositions to hold. If issues of money are
carefully backed by increases in government assets or claims to future
income streams, we expect these propositions to fail.
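
The analogy can be made concrete with a small numerical sketch. The figures used here (a firm valued at 1,000 with 100 shares outstanding) and the function name are hypothetical and serve only to illustrate the two cases described above; they are not drawn from the colonial data.

```python
def price_per_share(firm_value: float, shares: float) -> float:
    """Price per share when the firm's value is spread evenly over its shares."""
    return firm_value / shares


# Hypothetical starting point: present value of net revenues 1,000, 100 shares.
base = price_per_share(1_000.0, 100.0)

# Stock split: shares double, expected net revenues unchanged.
# This is the quantity-theory case (unbacked issue).
split = price_per_share(1_000.0, 200.0)

# Backed issue: shares double and the value of expected net revenues doubles too.
backed = price_per_share(2_000.0, 200.0)

print(f"base: {base}, after split: {split}, after backed issue: {backed}")
```

In the first case the price per share halves; in the second it is unchanged, which is the sense in which the effect of a new issue depends on its backing.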

Our approach, then, is to apply Sargent’s claim (1982, pp. 45-46)
that governments were “like a firm whose prospective receipts were its
future tax collections. The value of the government’s debt was, to a
first approximation, equal to the present value of current and future
government surpluses.” There are two methods by which this claim
will be applied to the data. We will first examine the monetary
experiences of North and South Carolina. Both of these colonies had
periods in which they issued (nearly) unbacked notes. In these periods
the quantity theory applies fairly well to the data. Each colony also
had a “currency reform” in which paper currency became much more
carefully backed. These reforms served to end currency depreciation.
Moreover, inflation and currency depreciation after these reforms
were not rekindled by extremely rapid rates of monetary growth. In
fact, in these “postreform” periods, large increases and reductions in
the money supplies of both colonies occurred. These had virtually no
impact on currency values.

There are two conclusions to be drawn from this evidence. One is
that the quantity theory does not hold generally. The second is that
these episodes provide evidence that the nature of backing is a
determinant of currency values.

This evidence on the Sargent-Wallace hypothesis is of an indirect
nature, however. In particular, it shows only that the way in which
money is backed affects the response of currency values to other
economic variables. Therefore, some time will then be spent
examining exchange rate determination in Maryland. As indicated
previously, Maryland backed a large component of its note circulation with
investments in Bank of England stock. At specified dates (1748 and
1764) notes were to be redeemed with this sinking fund. Thus, unlike
the notes of other colonies, Maryland notes were backed by a fund
whose market value can be followed over time. This presents an
opportunity to see how the value of backing for notes affected their
purchasing power. As will be seen, the value of backing for notes (and
the government’s track record for meeting scheduled redemptions)
had large and significant effects on Maryland currency values. The
quantity of notes circulating did not. Thus the Maryland experience
provides direct evidence in favor of the Sargent-Wallace view.


IV. The Evidence: South and North Carolina 

A. South Carolina 

South Carolina was one of the earliest colonies to experiment with
paper money and the first to create a loan office. It was also the first
colony (along with North Carolina) to experience large depreciations
of its paper currency, and, finally, it was the first colony to solve this
problem. Initially, South Carolina had issued (in 1703) £4,000 of
notes to finance expenditures. Following the general paradigm laid
out in Section I, at the same time these notes were issued future tax
levies were introduced to retire the notes. The same is true for
subsequent note issues (which can be followed in table 1). However, in fact
these tax proceeds were generally diverted to other uses, so that very
little retirement of notes was actually effected.

In 1712 South Carolina created a loan office, with a resultant large
increase in circulating notes. Thus by 1712 South Carolina had
created a system of monetary arrangements that were to persist until
1731. Very early on the note issues of the colony were made legal
tender, and they were always acceptable in payment of taxes. For a
brief period the colony experimented with notes that were
redeemable on demand for rice, but this arrangement was short-lived. For
our purposes, however, there is only one important feature of South
Carolina’s note issues: before 1731 they were poorly backed in the
sense that as they were issued the government did not succeed in
(significantly) raising its flow of future net tax receipts.7

Unfortunately, a general price index is not available for South
Carolina before 1732. However, table 1 reproduces the sterling
exchange rate series reported by McCusker (1978). As can be seen,
during the first 20 years of experience with paper currency,
depreciation was the rule. By the late 1720s the quantity of sterling
purchasable with £1 of South Carolina currency was barely more than
one-fifth of its 1710 level. Moreover, quantity-theoretic kinds of
predictions perform reasonably well. For instance, as indicated in
table 1, from 1710 until 1720 the per capita quantity of paper
currency in circulation increased by a factor of slightly more than 4.5. By
1723 the rate of exchange against sterling had increased by a factor of
exactly 4.5. In general, in fact, for the first 25 years of this period
increases in the stock of paper money tend to precede currency
depreciations. Thus the quantity theory appears to apply fairly well to
this period, in which paper currency issues were backed only in the
most nominal fashion.
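
The two factors quoted above can be checked directly against the table 1 entries. The following is only an arithmetic sketch; the variable names are illustrative, and the figures are the per capita note stock for 1710 and 1720 and the exchange rate for 1710 and 1723 as read from the table.

```python
# Figures as read from table 1 (South Carolina).
per_capita_notes = {1710: 1_286, 1720: 5_866}   # pounds per 1,000 population
exchange_rate = {1710: 150, 1723: 675}          # pounds S.C. per 100 pounds sterling

# Growth factor of the per capita note stock, 1710-20.
money_factor = per_capita_notes[1720] / per_capita_notes[1710]

# Growth factor of the sterling exchange rate, 1710-23.
exchange_factor = exchange_rate[1723] / exchange_rate[1710]

print(f"per capita notes, 1710-20: x{money_factor:.2f}")
print(f"exchange rate, 1710-23:    x{exchange_factor:.2f}")
```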

After 1727 South Carolina reversed its trend of currency
depreciation, maintaining exchange rates at or below their 1727 level until
1736. In fact, during the entire colonial period South Carolina’s
exchange rate against sterling was never more than 13 percent above its
1727 level. Moreover, South Carolina succeeded in this despite
continued growth in its currency stock. For instance, in 1731 the
outstanding note issue of the colony nearly doubled. Nevertheless, the
exchange rate merely returned to its 1727 level, where it remained
for 5 years following this increase. Thus currency depreciation was
halted and not rekindled by large changes in the money stock.

7 On this point see the discussion in Brock (1975, pp. 116-23).

TABLE 1

        Paper Currency  Pounds per    Exchange Rate     Price Level
        Outstanding     1,000         (£ S.C. per       (Average of
Date    (£)             Population    £100 Sterling)    1762-74 = 100)
        (1)             (2)           (3)               (4)

1703    4,000           ...           150               ...
1707    12,000          ...           150               ...
1708    14,000          ...           150               ...
1710    14,000          1,286         150               ...
1711    20,000          ...           150               ...
1712    56,000          ...           150               ...
1714    ...             ...           200               ...
1715    ...             ...           300               ...
1716    90,000          ...           ...               ...
1717    ...             ...           575               ...
1718    ...             ...           500               ...
1720    100,000         5,866         400               ...
1721    ...             ...           533               ...
1722    80,000          ...           580               ...
1723    120,000         ...           675               ...
1724    ...             ...           650               ...
1725    ...             ...           672               ...
1726    ...             ...           700               ...
1727    106,500         ...           700               ...
1728    106,500         ...           700               ...
1729    106,500         ...           700               ...
1730    106,500         3,550         644               ...
1731    211,275         ...           700               ...
1732    ...             ...           700               79
1733    ...             ...           700               80
1734    ...             ...           700               108
1735    ...             ...           700               105
1736    ...             ...           743               96
1737    ...             ...           753               117
1738    ...             ...           775               125
1739    ...             ...           792               84
1740    ...             ...           796               77
1741    ...             ...           691               97
1742    ...             ...           699               85
1743    ...             ...           700               70
1744    ...             ...           700               64
1745    ...             ...           700               46
1746    ...             ...           ...               45
1747    ...             ...           761               69
1748    ...             ...           762               88
1749    133,045         2,142         725               96
1750    ...             ...           702               100
1751    ...             ...           700               83
1752    ...             ...           700               97
1753    152,322         ...           700               112
1754    156,156         ...           700               86
1755    221,359         2,801         700               86
1756    311,816         ...           714               77
1757    542,837         ...           700               78
1758    595,567         ...           700               86
1759    521,369         ...           700               112
1760    863,827         9,182         700               92
1761    867,744         ...           700               80
1762    ...             ...           700               77
1763    584,916         ...           717               92
1764    585,246         ...           718               86
1765    472,378         4,327         709               87
1766    446,673         ...           707               100
1767    344,147         ...           700               94
1768    481,999         ...           700               102
1769    497,654         ...           ...               104
1770    424,154         3,414         717               93
1771    ...             ...           762               108
1772    ...             ...           679               137
1773    391,391         ...           728               116
1774    258,971         ...           700               104

Sources: Col. 1, Brock (1975), pp. 1??-26 and table 27; col. 2, Brock (1975) and U.S. Bureau of the Census, p. 1168; col. 3, McCusker (1978), pp. 222-24; col. 4, Taylor (1932).

The note issue of 1731 and all successive note issues in the colony
were not of a legal tender nature. The British government took a
hand and refused to approve any further legal tender note issues in
the colony. To compensate, the colony resorted to the use of paper
instruments known as public orders and tax certificates. While not
legal tender in private transactions, these notes were accepted for
taxes and, according to Brock (1975, p. 124), “custom nevertheless
caused them to circulate much as the legal tender bills did.” As there
appears to have been no important difference in practice between
public orders and tax certificates, I treat them homogeneously in what
follows.

I have already noted that in 1731 outstanding note issue doubled
with no apparent effects on the exchange rate. It will now be noted
(with reference to table 1) that after 1731 movements in the
outstanding stock of notes generally fail to account for price level or exchange
rate movements. For instance, from 1730 until 1749 there is a secular
decline in the per capita quantity of notes outstanding (a decline of 40
percent). Nevertheless, this reduction in the money supply did not
have a salutary effect on exchange rates, and, similarly, it appears that
the price level rose rather than declined.

Similarly, after 1749 we see a marked increase in the per capita
quantity of notes. From 1749 until 1755 per capita note issue
increased 31 percent, and from 1755 until 1760 per capita note issue
more than tripled. However, the price level in 1755 was 10 percent
lower than that in 1749, and the price level in 1760 was only 7 percent
higher than in 1755. Thus this extremely large increase in per capita
note issue (a factor of 4.3 in 11 years) was not reflected in prices
(which fell from 1749 to 1760) or exchange rates (which also
appreciated).

After 1760 there was a reduction in the note circulation of South 
Carolina nearly as dramatic as the increase just considered. From 
1760 until 1770 the per capita stock of paper currency was reduced by 
63 percent. Despite this large reduction, the price level rose slightly 
and exchange rates depreciated. Thus after 1731 quantity-theoretic 
predictions appear to do quite poorly. 
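
The percentage changes cited for 1749-70 follow from the table 1 entries, as the short computation below illustrates. It is a sketch only: the helper name `pct_change` and the dictionary names are illustrative, and the inputs are the census-year figures as read from table 1. A rise in the exchange rate series means more South Carolina currency per £100 sterling, that is, a depreciation.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Figures as read from table 1 (South Carolina).
notes_per_1000 = {1749: 2_142, 1755: 2_801, 1760: 9_182, 1770: 3_414}
price_level = {1749: 96, 1755: 86, 1760: 92, 1770: 93}
exchange_rate = {1749: 725, 1755: 700, 1760: 700, 1770: 717}

for start, end in [(1749, 1755), (1755, 1760), (1749, 1760), (1760, 1770)]:
    notes = pct_change(notes_per_1000[start], notes_per_1000[end])
    prices = pct_change(price_level[start], price_level[end])
    rate = pct_change(exchange_rate[start], exchange_rate[end])
    print(
        f"{start}-{end}: per capita notes {notes:+.0f}%, "
        f"prices {prices:+.0f}%, exchange rate {rate:+.0f}%"
    )
```

Over 1749-60 the per capita note stock rises by a factor of about 4.3 while the price index falls; over 1760-70 the note stock falls by nearly two-thirds while the price index edges up.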

Can this poor performance be accounted for in the context of
standard theories of money? The answer would appear to be no, for the
following reasons. The first way one might attempt to explain the
results above in the context of the quantity theory is to argue that
specie flows (or changes in the quantity of some other asset) may have
“offset” the changes in the stock of paper money noted above. This
appears untenable, however, in that it requires implausibly large
changes at certain points in time. In particular, the major injections
and withdrawals of money after 1750 were associated with (a) French
and Indian War deficit finance and (b) taxes levied for the retirement
of these note issues. As noted earlier, movements in the stock of
colonial specie almost certainly paralleled these movements as Britain
(a) sent substantial amounts of specie to the colonies during the war
and (b) levied substantial taxes on the colonies afterward. In the light
of this, the possibility of “offsetting” changes in other components in
the money supply seems small.

A second way in which one might attempt to salvage the quantity
theory is as follows. It might be supposed that colonial money
demand could be characterized, say, by a Cagan money demand
function. It might also be noted that (as will be discussed below) after 1731
monetary injections were always followed by promised future
monetary contractions. As Sargent and Wallace (1981) have shown, it is
possible for the effects of anticipated future monetary changes to
dominate current movements in the money supply. Could this
account for the observations noted above?

The answer is no. First, the Sargent-Wallace mechanism requires
that anticipated future monetary changes exceed current ones in
magnitude. As can be seen from table 1, this is not the case for the
post-1750 period. Also, the Sargent-Wallace mechanism operates
because the anticipated future deflation supposedly associated with
future monetary contractions increases money demand sufficiently to
offset the effects of current increases in the money supply. However,
as table 1 indicates, if colonials expected future deflation as a result of
the monetary reductions of the 1760s, they were sorely disappointed.
Hence this is not a tenable explanation of our observations.

Last, we might ask whether the observations above can be explained
within the framework of conventional money demand functions (or,
more broadly, conventional macro models). Some data that one might
desire for this purpose are not available, in particular, data on interest
rates. However, the period of substantial increase in the money
supply is a period of high wartime demand for goods and services, and
the period of monetary contraction appears by most accounts to have
contained a fairly standard postwar recession (see, e.g., Ernst 1973).
In the light of these facts, the nearly insignificant inflations of
1755-60 and of 1760-70 seem difficult to explain absent convenient shifts
in money demand functions. In fact, this seems generally true of the
post-1731 period. However, this explanation is not consistent with
standard presentations of the quantity theory. For instance, Friedman
and Schwartz (1963) (see their conclusion) associate the quantity
theory with the existence of highly stable money demand functions.
Thus this explanation will not salvage the quantity theory.

In the light, then, of the apparent failure of the quantity theory,
can the Sargent-Wallace view account for the observations at hand?
The answer is yes. As we have seen, when note issues are poorly
backed the quantity theory becomes a special case of this view. Thus it
is consistent with our pre-1731 observations. After 1731 we observe
major fluctuations in the quantity of money. For instance, we have
seen that in 1731 the money stock doubled, yet this had no effect on
exchange rates. Later monetary changes also had minimal effects on
both prices and exchange rates. I will now argue that this is because
note issues after 1731 were carefully backed by future tax receipts.
Thus the Sargent-Wallace view accounts for the absence of effects on
currency values.

It is clear that the post-1750 issues were carefully backed since it
was the future tax levies that permitted the post-1760 withdrawal of
notes. Note issues between 1731 and 1750 were also carefully backed.
Between 1731 and 1745, £259,282 of new issues had occurred. By
1749 only £26,545 of these notes were still in circulation. This
indicates that note issues were well backed by future tax levies. Thus the
statement of Brock (1975, p. 126) regarding this period appears
justified: “the orders of the various issues were all with reasonable
promptness drawn in by taxes.”
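
The extent of this retirement is easy to quantify from the two figures just given. The computation below is an illustrative sketch with made-up variable names; it assumes, as the text implies, that the difference between the two figures represents notes withdrawn from circulation by 1749.

```python
issued_1731_45 = 259_282     # pounds of new issues, 1731-45
outstanding_1749 = 26_545    # pounds of those issues still circulating in 1749

# Notes from the 1731-45 issues no longer in circulation by 1749.
retired = issued_1731_45 - outstanding_1749
retired_share = retired / issued_1731_45

print(f"retired by 1749: {retired:,} pounds ({retired_share:.1%} of the issues)")
```

Roughly nine-tenths of the 1731-45 issues had therefore been withdrawn by 1749.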

It would seem, then, that the Sargent-Wallace view that the quantity
of currency can fluctuate widely without affecting its value (if
currency is carefully backed) is borne out by the experience of South
Carolina. It will also be noted that its experience is similar to that of
the four hyperinflation countries examined by Sargent (1982).
Specifically, South Carolina (as did Sargent’s four countries) ended a
decline in the value of its (poorly backed) currency by replacing it with
a currency that was carefully backed.

Finally, this experience is suggestive of the thought experiment
conducted by Barro (1974). Specifically, colonial finance has the
feature that current expenditures were financed by government issue of
liabilities, accompanied by future tax levies. This is the finance
scheme contrasted by Barro with current tax financing of
expenditures. The minimal price level impact of money issues seems to bear
out Barro’s analysis in the sense that it indicates that the timing of tax
levies had no significant effect even on price level movements.

B. North Carolina

In most respects, the monetary history of North Carolina parallels
that of South Carolina. In 1712, when the Carolinas split, North
Carolina had £4,000 of its currency in circulation. As can be seen in
table 2, this quantity tripled the next year, and then the stock of paper
currency doubled again by 1715. Thus, as was the case in South
Carolina, North Carolina’s history was one of rapid early expansion
of its money stock.

In addition to this paper currency, North Carolina had a number of
rated commodities of a legal tender nature (with a legally fixed
exchange rate into currency, which could differ from the market price
of the commodity). I have already commented above on the general
acceptability of this currency. Finally, as noted previously, “it appears
certain that there was never any substantial amount of coin in the
colony throughout the period” (Brock 1975, pp. 107-8).

As can be seen from table 2, there is no evidence in favor of the
quantity theory arising from North Carolina’s experience. From 1715
until 1722 (from which point the money stock was held constant until
1729) the money supply of the colony was cut in half. Nevertheless,
exchange rates depreciated dramatically. Then in 1729, when North
Carolina first instituted its loan office, the money supply of the colony
more than quadrupled. While a large depreciation did occur, the
exchange rate never much exceeded twice its 1729 level. Moreover, it
took 10 years for this doubling to occur. Thus both directions of
change before 1748 (I refer here to the 1715-29 period) and relative
magnitudes of changes are not supportive of quantity-theoretic
predictions for the period.

TABLE 2

        Notes in        Pounds per    Exchange Rate
        Circulation     1,000         (£ N.C. per
Date                    Population    £100 Sterling)
        (1)             (2)           (3)

1712    4,000           ...           ...
1713    12,000          ...           ...
1715    24,000          ...           150
1722    12,000          ...           500
1724    12,000          ...           500
172?    12,000          ...           ...
1729    52,000          ...           500
1731    52,000          ...           650
1734    54,500          ...           ...
1735    ...             ...           720
1736    ...             ...           700
1737    ...             ...           867
1739    ...             ...           1,000
1748*   21,350          ...           1,033
1749    21,160          ...           ...
1750    20,647          283           133
1751    20,119          ...           ...
1752    19,028          ...           ...
1753    18,289          ...           ...
1754    57,951          ...           167
1755    56,054          611           160
1756    57,951          ...           180
1757    68,255          ...           ...
1758    70,253          ...           ...
1759    69,512          ...           185
1760    75,806          686           190
1761    95,335          ...           200
1762    85,322          ...           200
1763    ...             ...           200
1764    73,378          ...           193
1765    ...             ...           200
1766    67,880          ...           ...
1767    ...             ...           173
1768    60,106          334           180

Sources: Col. 1, Brock (1975), pp. 108, 112, tables 23, 24; col. 2, U.S. Bureau of the Census (197?), p. 1168, and Brock (1975); col. 3, McCusker (1978), pp. 217-19.
* Currency reform. New monetary unit employed.

Before 1748 it is clear from table 2 that currency values declined
markedly in North Carolina. In 1748 a currency reform was
implemented. A new set of notes was issued to replace those in circulation,
with one new note to replace seven and one-half old ones. According
to Brock (1975), this tripled the effective money supply of the colony.
Then after 1748, while exchange rates were hardly stable, they never
exceeded their 1748 level by more than 50 percent. This constitutes a
major success when compared with the nearly 600 percent
depreciation of 1715-48. How did North Carolina succeed, then, in slowing so
dramatically its rate of currency depreciation? Further examination
of table 2 indicates that this was not achieved by reducing rates of
money growth. In fact, from 1750 to 1755 the per capita money stock
in North Carolina more than doubled. The exchange rate
depreciated only 20 percent. From 1750 to 1760 the money stock grew by
142 percent in per capita terms. This occasioned only a 43 percent
depreciation in the exchange rate. Hence for the first dozen or so
years after the currency reform, money growth far outstripped
currency depreciation.
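
The postreform comparison can be reproduced from the census-year entries of table 2. The sketch below uses illustrative names (`pct_change`, `notes_per_1000`, `exchange_rate`) and the figures for 1750, 1760, and 1768 as read from the table; it covers both the 1750-60 expansion discussed above and the contraction to 1768 taken up next. A positive change in the exchange rate is a depreciation of North Carolina currency against sterling.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# Figures as read from table 2 (North Carolina).
notes_per_1000 = {1750: 283, 1760: 686, 1768: 334}   # pounds per 1,000 population
exchange_rate = {1750: 133, 1760: 190, 1768: 180}    # pounds N.C. per 100 pounds sterling

for start, end in [(1750, 1760), (1760, 1768)]:
    money = pct_change(notes_per_1000[start], notes_per_1000[end])
    rate = pct_change(exchange_rate[start], exchange_rate[end])
    print(f"{start}-{end}: per capita notes {money:+.0f}%, exchange rate {rate:+.0f}%")
```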

After 1761 the money supply declined, as it did in all colonies, 
because of the retirement of notes provided for in their emission. By 
1768 the per capita money stock was only half of what it had been in 
1760. Nevertheless, North Carolina's exchange rate appreciated only 
5 percent. 

Clearly, then, the quantity theory cannot account for any of the
North Carolina experience. How do we account for it according to the
Sargent-Wallace approach? First, we should note that prior to 1748
there was no meaningful sense in which North Carolina backed its
notes. The reduction in the money supply between 1715 and 1722
represents the only time prior to 1748 during which any notes were
retired through taxation. Hence monetary expansions were not
accompanied by increased future government revenue streams, and we
should not be surprised by currency depreciation. Of course, since
the quantity theory becomes a special case of the Sargent-Wallace
view when money is unbacked, the failure of the quantity theory is
also a failure of this viewpoint. Naturally, though, the Sargent-Wallace
approach does no worse for this period than the quantity theory.

The Sargent-Wallace approach does permit an explanation for the
relative success of the 1748 currency reform, however. We have
already noted that, prior to 1748, paper money was essentially
unbacked. In fact, the fiscal situation in the colony was generally poor.
According to Brock (1975, pp. 112-13),

With the exception of the years 1715 to 1722, no bills seem
ever to have been retired by taxation. The loan office was
badly managed. To make matters worse, North Carolina
remained a barter colony. Until the law of 1748 provided for
payment of taxes in gold, silver, or bills of credit, they had
been payable in the rated commodities. The result was, as
successive governors complained, that the taxes were paid in
the commodity rated highest in proportion to its actual
value, and of that commodity each person tendered his most
inferior stock. It is small wonder, then, that the sums raised
in taxes for the retirement of the outstanding bills were so
frequently negligible. But the evil did not stop here. Taxes
levied to meet the annual cost of government proved
similarly unproductive. The colony fell into debt; and in order to
pay the debt, a new issue of bills was emitted.

Thus, it is not surprising that with poor revenue prospects on the part
of the government its liabilities were little valued.

After 1748, as pointed out by Brock, taxes were no longer payable
in commodities. Moreover, retirement of notes through the provision
of taxes for this purpose was much more of a factor. Table 3 reports
the cancellation of notes by this method after 1748. As can be seen,
this retirement of notes occurred on a regular basis and constituted a
generally significant fraction of total notes in circulation. Hence we
can, at least partially, account for the success of the currency reform
by the superior nature of the backing provided for notes after this
date.
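
As a consistency check, the cancellations in table 3 can be set against the year-to-year declines in notes in circulation in table 2 for 1748-53, a stretch in which the table shows no new emissions. The sketch below is illustrative only; the dictionary names are assumptions, and the figures are those read from the two tables.

```python
# Notes in circulation (table 2) and amounts canceled (table 3), in pounds.
notes = {1748: 21_350, 1749: 21_160, 1750: 20_647,
         1751: 20_119, 1752: 19_028, 1753: 18_289}
canceled = {1749: 190, 1750: 514, 1751: 527, 1752: 1_091, 1753: 739}

# Compare each year's decline in circulation with the reported cancellation.
gaps = {}
for year in sorted(canceled):
    decline = notes[year - 1] - notes[year]
    gaps[year] = decline - canceled[year]
    print(f"{year}: decline {decline:>5,}  canceled {canceled[year]:>5,}  gap {gaps[year]:+d}")

total_canceled = sum(canceled.values())
share_of_1748_stock = total_canceled / notes[1748]
print(f"canceled 1749-53: {total_canceled:,} pounds "
      f"({share_of_1748_stock:.1%} of the 1748 circulation)")
```

The annual declines match the reported cancellations to within a pound, and cumulative cancellations over the five years come to about one-seventh of the 1748 circulation.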

Again, one might wonder whether this analysis has failed to pick up
important changes in other components of the money supply that
account for the poor showing of the quantity theory above. Again, the
answer would appear to be no. Some of the reasons for this have been
previously elaborated, so we restrict ourselves to two points here.
Consider the period of postcurrency reform. This is divided roughly
in half in table 2: an initial period of large increase in note circulation
followed by a period of large reduction. Could these movements in
the stock of paper currency have been offset by changes in other
components of the money supply?

It would appear that they could not have been offset to any
significant degree by specie movements. In particular, our earlier
comment about the scarcity of specie in North Carolina appears to
hold for this later period as well (see Brock 1975, p. 143). In addition,
changes in the nature of the commodity money system would lead
one to believe that the monetary growth of the first half of this period
is under- rather than overstated. In particular, in 175? North
Carolina established a system of state warehouses and legal tender
commodity notes. This must certainly be viewed as having the effect
of a monetary expansion (although probably not to any great extent).
In short, then, there is no reason to think that our focus on paper
currency alone does any substantial injustice to the quantity theory.

TABLE 3

Note Cancellation via Taxation in North Carolina

        Amount
        Canceled
Date    (£)

1748    ...
1749    190
1750    514
1751    527
1752    1,091
1753    739
1754    338
1755    1,897
1756    1,809
1757    4,527
1758    9,544
1759    ...
1760    5,853
1761    619
1762    10,013
1763    ...
1764    11,944
1765    ...
1766    5,498
1767    ...
1768    7,774

Source: Sum of cancellations reported by Brock (1975), tables 23, 24.

C. Remarks

At this point a few remarks are probably in order. First, when secular
movements in the price level fail to mirror secular movements in the
money supply, it is typical in studies of this type (see, e.g., Friedman
and Schwartz [1963] and their discussion of the greenback period) to
explicitly examine movements in real output and velocity. This is not
possible for the colonial period, since there is insufficient knowledge
of the behavior of real output. However, given the magnitudes of
observed variations in real balances, it is clear that these variations
cannot be accounted for by changes in the level of real activity. Hence
velocity must have varied substantially during the colonial period.
And, of course, such variation in velocity is inconsistent with many
presentations of the quantity theory (see, e.g., Friedman and
Schwartz 1963, 1982).

However, some presentations of the quantity theory (Friedman
1956) make velocity a stable function of some limited set of
arguments. Most commonly these would involve a measure of the
opportunity cost of holding money, such as a nominal interest rate. Then
one might argue that, if the opportunity cost of holding money
moved appropriately over time, the variability of velocity would be
consistent with the quantity theory. Unfortunately, there are no
systematic observations on the behavior of interest rates during the
colonial period that would allow this argument to be examined explicitly.
However, one observation suggests that the opportunity cost of
holding money cannot have varied too substantially during periods of
relative exchange rate stability (such as we observe in South Carolina
after 1727). In particular, it is known that sterling bills of exchange
(discussed above) did not circulate at a discount when they were of
sufficiently short maturity. Hence, these assets, which were sterling
denominated, did not bear interest. Moreover, if exchange rates were
extremely stable, then the implied nominal return on these assets
cannot have varied too greatly over the period of interest. To the
extent that bills of exchange might be viewed as substitutes for
money, then, this argument suggests that variations in velocity in the
colonial period cannot be explained by major variations in the
opportunity cost of holding money. Hence simple changes in the
specification of the behavior of velocity appear as if they will not
salvage the quantity theory.

V. The Evidence: Maryland 

We have seen that the Carolinas provide a wealth of evidence against 
the quantity theory. In addition, experience there suggests that the 
nature of backing for notes was crucial in determination of their 
value, as in both Carolinas currency depreciation was halted by the 
expedient of carefully backing notes. In this section we will delve 
more deeply into the question of how well the backing of a note issue 
accounts for its value. 

As argued above, Maryland provides a particularly appropriate
setting in which to do this. Maryland existed with a monetary system
based entirely on specie and a commodity money (tobacco) until 1733.
In that year a paper currency was introduced explicitly to provide a
medium of transaction for the (by now significant) part of the colony
that did not grow tobacco. Most of this currency was injected into the
economy via classic lump-sum transfers and was backed in a way
unique in colonial experience. The proceeds of designated taxes were
to be invested by agents of the colony in Bank of England stock. This
investment was to constitute a sinking fund for the notes. Of the
£90,000 issued at this time,8 on the order of £60,000 was backed by
this sinking fund. The remainder was issued through land banks. In
addition, during the French and Indian War there were additional
note issues to finance government deficits. These were not claims
against the sinking fund, but rather were backed in conventional
(colonial) fashion by future tax receipts.

At specified dates, in 1748 and 1764, notes were to be redeemed for
sterling (or, more precisely, sterling bills of exchange). One-third of
the outstanding notes were to be redeemed in 1748 and the
remaining two-thirds in 1764. These redemptions occurred as scheduled.
For our purposes, however, the unique feature of this system is that
Maryland notes were backed by a fund whose (current) market value
is easily ascertainable at any point in time. Thus we may investigate
the extent to which changes in the market value of the sinking fund
account for exchange rate fluctuations. This seems particularly
appropriate as, in addition to serving as a medium of exchange, these
notes were simply claims for future delivery of sterling. Since the
exchange rate is merely the rate at which sterling could be converted
into paper currency, this is the sterling price of a future claim on
sterling. We investigate how changes in the market value of this
sinking fund affected the value of these claims.

8 Actually, the £90,000 was only authorized at this time. It took some years for total
circulation to approach this level.

The format of this section is as follows. In order to illustrate the 
kind of role the market value of backing can play, a highly simplified 
model of how paper money might be priced as an asset is presented. 
Then some statistical evidence on the relative importance of the quan¬ 
tity of money, the market value of backing, and the colony’s track 
record for honoring its commitments is presented. It will be seen that 
currency values in Maryland depended entirely on the latter two fac¬ 
tors. The quantity of money is irrelevant in the determination of 
currency values. 

A. An Illustrative Model 

It will be recalled that, among their other functions, notes in Mary¬ 
land were claims to future delivery of sterling (bills of exchange). In 
this section we attempt to see to what extent empirically an extremely 
simple asset pricing model can account for movements in Maryland 
currency values. The model presented is oversimplified, in fact, for 
brevity of presentation. 

What, then, would be the primary factors in any model attempting 
to explain asset pricing? Obviously the most important factors would 
be the kinds of promised future payoffs to which assets are a claim 
and the probabilities that these promises will be honored (or, in a 
contingent claims setting, that the relevant states will occur). Thus it is 
necessary to discuss the kinds of promises made by Maryland and how 
these promises were (in all likelihood) perceived by the residents of 
the colony. 

The promise of Maryland to redeem a third of its notes for sterling 
in 1748 and the remaining two-thirds in 1764 (at the rate of four 
Maryland pounds for three pounds sterling) was (at least on its face) 
uncontingent. However, so were the promises of South Carolina to 
retire note issues via taxation before 1731. Before Maryland ever had 
recourse to a paper currency, then, most of the colonies had estab¬ 
lished by long experience that these promised redemptions or retire¬ 
ments were, in fact, contingent. On what were they contingent, then? 
First, funds earmarked for retirement of notes were often appropri- 
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ated in the face of government expenditure needs. The larger the 
market value of the sinking fund, then, the greater the ability of the 
government to redeem notes and meet its additional revenue needs 
(if any) from the proceeds of the sinking fund. Second, when funds 
provided for the retirement of notes proved insufficient, this retirement 
was typically postponed. Hence the larger the market value of 
the sinking fund, the smaller the probability that retirement would 
not occur as scheduled. 

I now introduce some notation. Let $MV_t$ denote the market value of 
the sinking fund at date $t$, and let $T$ denote the announced redemption 
date.9 In the light of previous remarks, assume redemption will 
actually occur at $T$ only if $MV_T$ meets or exceeds some critical value, $R$. 
Let $F_T[x \mid MV_t, MV_{t-1}, \ldots]$ be the conditional probability that $MV_T \le x$ 
evaluated at $t$, where in principle there could be a number of variables 
on which this probability might depend. To simplify matters, I assume 
that the only relevant conditioning variables are the historical 
market values of the sinking fund. In addition, I make the plausible 
assumption that $F_T[x \mid y^*_t, y^*_{t-1}, \ldots] \le F_T[x \mid y_t, y_{t-1}, \ldots]$ for all $x$ and for 
any sequences $\{y_t\}$, $\{y^*_t\}$ such that $y^*_t \ge y_t$. 

Finally, suppose $MV_T < R$. Then a simple assumption is that redemption 
will occur at the first date for which the value of the sinking 
fund is at least $R$. Let $F_{T+j}[x \mid MV_{T+j-1} < R, MV_{T+j-2} < R, \ldots, MV_t, MV_{t-1}, \ldots]$ 
be the conditional probability at $t$ that $MV_{T+j} \le x$, given 
that $MV_{T+j-s} < R$, $1 \le s \le j$, and given the realized sequence of 
market values. 

Given these notational conventions, let us proceed with a simple 
model in which currency values are determined as if they were conventional 
asset prices. Let $e_t$ denote the sterling value of a Maryland 
pound note. Let $r$ be (an exogenously given) discount factor.10 Then 
an absence of (expected) arbitrage opportunities implies 

$$e_t = \frac{{}_t e_{t+1}}{1 + r}; \quad t < T - 1, \tag{1}$$

where ${}_t e_{t+1}$ is the date $t$ expected value of $e_{t+1}$. At date $T - 1$ redemption 
will occur next period if $MV_T \ge R$. Hence for $t \ge T - 1$, 

$$e_t = \left(\frac{1}{1 + r}\right)\left\{[1 - F_T(t)](.75) + F_T(t)\,{}_t e_{t+1}\right\}, \tag{2}$$

where $F_T(t) \equiv F_T[R \mid MV_t, MV_{t-1}, \ldots]$, $e_{t+1}$ is the exchange rate at $t + 1$ 
if redemption has not yet occurred, and ¾ is the promised rate at 
which Maryland pounds were to be converted into sterling. 


9 I assume a single redemption date for simplicity. 
10 Data on interest rates are unavailable. 
Solving (1) and (2) forward, we obtain 

$$e_t = \left(\frac{1}{1 + r}\right)^{T-1-t} {}_t e_{T-1}, \tag{3}$$

$${}_t e_{T-1} = [1 - F_T(t)]\left(\frac{.75}{1 + r}\right) + F_T(t)[1 - F_{T+1}(t)]\frac{.75}{(1 + r)^2} + F_T(t)F_{T+1}(t)\frac{{}_t e_{T+1}}{(1 + r)^2}, \tag{4}$$

with an obvious notation in (4). When repeated substitutions are applied 
to (4), the latter term vanishes if, for instance, ${}_t e_{T+k}$ is bounded 
for all $k$ and if $r > 0$. Then ${}_t e_{T-1}$ will be, through the relevant conditional 
probability distributions, a function of the sequence $\{MV_{t-j}\}_{j=0}^{\infty}$. 
Let us denote this as ${}_t e_{T-1} = \psi[MV_t, MV_{t-1}, \ldots]$. Then (3) and (4) 
imply 

$$e_t = \left(\frac{1}{1 + r}\right)^{T-1-t} \psi[MV_t, MV_{t-1}, \ldots]. \tag{5}$$

Moreover, it is apparent that $\psi(\cdot)$ is monotone nondecreasing in 
$MV_{t-j}$, $j \ge 0$. 
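The pricing recursion described in this subsection can be illustrated numerically. The sketch below discounts the promised redemption rate of .75 backward through a sequence of shortfall probabilities and then back to an earlier date. The function names, the finite truncation of the horizon, and the sample probabilities are assumptions made purely for illustration; they are not taken from the original model or data.

```python
def expected_value_at_T_minus_1(shortfall_probs, r, redemption_rate=0.75,
                                terminal_value=0.0):
    """Iterate the redemption-period recursion backward.

    shortfall_probs[k] is the (assumed) probability that the sinking fund
    is still below the critical value R at date T + k, given that no
    redemption has occurred yet. The horizon is truncated after
    len(shortfall_probs) periods, at which point the note is assigned
    terminal_value (a hypothetical simplification).
    """
    value = terminal_value
    for p in reversed(shortfall_probs):
        # With probability 1 - p the note is redeemed at the promised rate;
        # with probability p redemption is postponed one more period.
        value = ((1.0 - p) * redemption_rate + p * value) / (1.0 + r)
    return value


def note_value(t, T, shortfall_probs, r, redemption_rate=0.75):
    """Sterling value of a Maryland pound note at a date t <= T - 1."""
    if t > T - 1:
        raise ValueError("this sketch only covers dates up to T - 1")
    at_T_minus_1 = expected_value_at_T_minus_1(shortfall_probs, r, redemption_rate)
    # Before T - 1 the note simply appreciates at the discount rate.
    return at_T_minus_1 / (1.0 + r) ** (T - 1 - t)
```

Under these assumptions, lowering the shortfall probabilities (that is, a better-funded sinking fund) raises the computed note value, which is the monotonicity property the model relies on.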

Our method of applying this model is as follows. From equations 
(3) and (4), our model predicts that the ratio of $e_t$ to the redemption 
rate (¾) is related positively to market values of the sinking fund. 
Therefore we estimate below the equation 

$$par_t = \alpha_0 + \alpha_1 M_t + \alpha_2 MVI_t + \alpha_3 D_t + \epsilon_t, \tag{6}$$

where $par_t = .75\,e_t^{-1} - 1$, $M_t$ is the quantity of notes in circulation at $t$, 
and $D_t$ is a dummy taking the values 

$$D_t = 0; \quad t < 1748,$$
$$D_t = 1; \quad t \ge 1748. \tag{7}$$

This variable reflects the fact that Maryland did honor its commitment 
to redeem notes in 1748. Until this point there was no reason in 
particular for colonists to believe that this commitment would be 
honored.11 According to the view that money is priced in the same way as 
any other asset, a record of its issuer’s honoring promises should 
enhance the value of this currency. The variable $MVI_t$ is the value of 
the sinking fund at $t$ divided by its value in 1764. Finally, as should be 
clear from (3) and (4), $par_t$ (the percentage the exchange rate at $t$ is 
above the relevant redemption rate) is an appropriate dependent 


11 This is certainly the case, as only a minority of other colonies had regularly and 
strictly honored their commitments with respect to retirement of paper currency. 
variable. According to the Sargent-Wallace view, then, we expect $\alpha_2 < 0$, 
$\alpha_3 < 0$, and $\alpha_1 = 0$. 

Obviously, this is hardly a sophisticated approach to pricing money 
symmetrically with other assets. In principle far more sophisticated 
asset pricing models along the lines of Hansen and Singleton (1982) 
or Mehra and Prescott (1983) could be applied and directly imple¬ 
mented empirically. However, as can be seen from table 4, the num¬ 
ber of available observations is small. Hence the best we can hope for 
is a fairly general indication of whether the Sargent-Wallace hy¬ 
pothesis accounts for the data. In fact, this limited number of observa¬ 
tions accounts for the absence of lagged variables in (6). 
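The two constructed variables in the estimating equation follow directly from the verbal definitions in the text: the dependent variable is the percentage by which the exchange rate exceeds the redemption rate of four Maryland pounds for three pounds sterling, and the dummy switches on with the first redemption in 1748. A minimal sketch, with hypothetical function names and no actual data, might look like this:

```python
REDEMPTION_RATE = 4.0 / 3.0  # Maryland pounds promised per pound sterling


def par_deviation(md_pounds_per_sterling):
    """Fraction by which the exchange rate exceeds the redemption rate.

    The input is the number of Maryland pounds needed to buy one pound
    sterling; a value of 4/3 corresponds to notes trading exactly at par.
    """
    return md_pounds_per_sterling / REDEMPTION_RATE - 1.0


def redemption_dummy(year, first_redemption_year=1748):
    """1 once the colony has honored its first redemption, else 0."""
    return 1 if year >= first_redemption_year else 0
```

For example, an exchange rate of two Maryland pounds per pound sterling gives a deviation of .5, that is, notes trading 50 percent above the redemption rate.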

B. The Evidence 

The evidence presented in this section is derived from application of 
ordinary least squares to (6). This procedure should yield consistent 
parameter estimates for the following reason. Clearly the only right- 
hand-side variables whose exogeneity is suspect are the money supply 
and the market value of the sinking fund. With respect to the money 
supply, all authorized changes in it either were made in 1733, before 
any observations on exchange rates were available, or were a result of 
wartime deficit finance. This latter component might appear partially 
endogenous, as changes in exchange rates may have altered the nomi¬ 
nal value of government expenditures. However, as inspection of 
table 4 will confirm, exchange rates were fairly stable during the 
French and Indian War (1756-63), so that this factor should not have 
been operative. 
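For readers who want to see the mechanics, ordinary least squares with t-statistics and R-squared can be sketched as below. This is a generic illustration using NumPy, not the author's original computation; the function name `ols` and the toy numbers in the usage note are assumptions.

```python
import numpy as np


def ols(y, regressors):
    """OLS with an intercept; returns coefficients, t-statistics, and R^2.

    `regressors` is a list of equal-length sequences (for instance the
    money stock, the sinking-fund index, and the redemption dummy).
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(c, dtype=float) for c in regressors]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]                 # observations minus parameters
    sigma2 = float(resid @ resid) / dof       # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # classical covariance matrix
    t_stats = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))
    return beta, t_stats, r2
```

With the toy data `y = [1, 3, 2, 5, 4]` and a single regressor `[0, 1, 2, 3, 4]`, the sketch returns an intercept of 1.4, a slope of .8, and an R-squared of .64.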

The market value of the sinking fund at any date was the sum of 
three factors: tax proceeds from an excise tax on tobacco exports 
invested in Bank of England stock; dividends paid on the sinking 
fund, which were reinvested in the fund; and capital gains or losses 
on the stock held. Certainly we may take Bank of England stock prices 
and dividends paid on this stock as unaffected by events in Maryland. 
In principle, tax proceeds on tobacco exports could have been af¬ 
fected by Maryland exchange rate variation to the extent that it in¬ 
fluenced tobacco exports. However, this is a question on which some 
evidence can be produced. In particular, from the results of Sims 
(1972), it is known that there exists a model in which MVI is strictly 
econometrically exogenous with respect to par only if MVI is not 
Granger caused by par. In table 5 tests of Granger causality are pre¬ 
sented that show that, at a very high marginal significance level, MVI 
is not Granger caused by par. Thus we cannot reject that this neces¬ 
sary condition for strict exogeneity is satisfied. In addition, table 5 
reports tests of whether par Granger causes EXS (exports from Mary- 
TABLE 5 
Exogeneity Tests 

A. $MVI_t = a + \sum_{i=1}^{r} b_i MVI_{t-i} + \sum_{i=1}^{s} c_i\,par_{t-i}$ 

Equation    r    s    F(s, 24 - n)    Marginal Significance Level 
(i)         1    1    .57             .57 
(ii)        1    2    1.10            .55 

B. $EXS_t = a + \sum_{i=1}^{r} b_i EXS_{t-i} + \sum_{i=1}^{s} c_i\,par_{t-i}$ 

Equation    r    s    F(s, 20 - n)    Marginal Significance Level 
(iii)       1    1    1.15            .27 
(iv)        2    2    .8[illegible]   .45 

Source. The sources for all data other than exports are reported in the text. Scottish exports are from U.S. 
Bureau of the Census (1976, pp. 1177-78). 

Note. Summary statistics for the regression equations are as follows: (i) R² = [illegible], D-W = [illegible], 
Q(13) = 8.82 (significance level = .79); (ii) R² = .21, D-W = 1.73, Q(13) = 11.40 (significance level = .58); 
(iii) R² = .44, D-W = 1.88, Q(12) = [illegible] (significance level = [illegible]); (iv) R² = .44, 
D-W = 1.94, Q(11) = [illegible] (significance level = [illegible]). 


land to Scotland). These would be almost entirely tobacco exports. We 
use Scottish rather than English exports because the U.S. Bureau of 
the Census (1976) does not separate Virginia from Maryland in its 
data on exports to Britain. As can be seen, at a marginal significance 
level of .43, we cannot reject the hypothesis that exports are not 
caused by exchange rates. Hence the suspect component of MVI does 
not appear to be correlated with the error term in (6). Therefore, we 
may proceed with our OLS estimation of (6) with some degree of 
confidence. 
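The Granger-causality tests discussed above compare a regression of a variable on its own lags with one that adds lags of the candidate causal variable. A minimal sketch of the corresponding F-statistic follows. It uses NumPy only; the function names and the simulated series in the usage note are hypothetical, and the lag lengths `r` and `s` are simply chosen by the caller.

```python
import numpy as np


def _rss(target, design):
    """Residual sum of squares from a least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)


def granger_f(y, x, r, s):
    """F-statistic for the hypothesis that s lags of x do not help predict y.

    The restricted regression uses a constant and r own lags of y; the
    unrestricted regression adds s lags of x. Returns the statistic and
    its (numerator, denominator) degrees of freedom.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    p = max(r, s)
    n = len(y)
    target = y[p:]
    own = np.column_stack(
        [np.ones(n - p)] + [y[p - i:n - i] for i in range(1, r + 1)]
    )
    other = np.column_stack([x[p - i:n - i] for i in range(1, s + 1)])
    full = np.column_stack([own, other])
    rss_restricted = _rss(target, own)
    rss_unrestricted = _rss(target, full)
    dof = len(target) - full.shape[1]
    f_stat = ((rss_restricted - rss_unrestricted) / s) / (rss_unrestricted / dof)
    return f_stat, (s, dof)
```

A small F-statistic (equivalently, a high marginal significance level) means the added lags contribute little, so the hypothesis of no Granger causality cannot be rejected. On simulated data in which `x` drives `y` with a one-period lag, the statistic for "x causes y" is large while the reverse statistic is small.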

The data used are as follows. Data on exchange rates are taken 
from McCusker (1978), who reports the number of Maryland pounds 
required to purchase a sterling bill of exchange. Data on the money 
supply are taken from Brock (1975). Incidentally, it should be noted 
that Brock reports that it is not possible to tell exactly how fast au¬ 
thorized wartime monetary increases were actually spent or exactly 
how fast they were retired. Hence the numbers reported for 1756-62 
have some errors, with the earlier and later numbers probably some¬ 
what overstating the true money supply. Thus the usual caveats re¬ 
garding error-laden variables apply. Finally, data on the sterling value 
of the sinking fund appear in the Scharf Collection of the Maryland 
Historical Society, which contains the surviving periodic reports of 
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the London trustees for the sinking fund. 12 Bank of England stock 
prices are reported by Mirowski (1981). All these data are reproduced 
in table 4. 

In order to get a feel for the magnitudes by which different factors 
influenced currency values, three different versions of (6) are reported. 
First, to gauge the extent to which monetary changes affected 
currency values, (6) was run with the constraints $\alpha_2 = \alpha_3 = 0$ imposed. 
The resulting equation, with $t$-statistics in parentheses, was 

$$par_t = \underset{(.896)}{.310} + \underset{(.263)}{(1 \times 10^{-6})}\,M_t, \tag{8}$$

$R^2 = .003$, D-W = .411, $Q(13) = 48.33$; 

D-W is the Durbin-Watson statistic, and $Q(13)$ is the value of the Box-Pierce 
(1970) serial correlation test statistic with 13 degrees of freedom. 
The marginal significance level of the $Q$-statistic is $6 \times 10^{-6}$, 
and for the coefficient $\alpha_1$ it is .80. 
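Both diagnostics are simple functions of the regression residuals. The sketch below computes them with NumPy; the function names are hypothetical, and the Box-Pierce statistic is formed from residuals that are demeaned first (an assumption that is harmless when the regression includes an intercept).

```python
import numpy as np


def durbin_watson(residuals):
    """Sum of squared first differences of the residuals over their sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def box_pierce_q(residuals, max_lag):
    """n times the sum of squared residual autocorrelations up to max_lag.

    Under the null of no serial correlation the statistic is compared with
    a chi-square distribution with max_lag degrees of freedom.
    """
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = len(e)
    denom = np.sum(e ** 2)
    rho = [np.sum(e[k:] * e[:-k]) / denom for k in range(1, max_lag + 1)]
    return float(n * np.sum(np.square(rho)))
```

A Durbin-Watson value near 2 and a small Q indicate little serial correlation; values of D-W far below 2, such as the .411 reported for (8), point to strongly positively correlated residuals.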

Clearly the quantity of money alone has no impact on currency 
values. Its coefficient is extremely small, and we cannot reject the 
hypothesis that it is zero even at extremely high significance levels. 
Finally, of course, (8) performs extremely poorly. 

Next, (6) was run subject to the constraint $\alpha_3 = 0$. The resulting 
equation is 

$$par_t = \underset{(2.04)}{.84} + \underset{(2 \times 10^{-2})}{(1 \times 10^{-7})}\,M_t - \underset{(2.82)}{.91}\,MVI_t, \tag{9}$$

$R^2 = .364$, D-W = 1.21, $Q(8) = 2.50$. 

This equation is much better behaved than (8). The marginal 
significance level of the $Q$-statistic is .96, so this suggests no serial 
correlation. The coefficient on the money stock continues to be extremely 
small and highly insignificant. And finally, the coefficient on 
the (index of the) market value of the sinking fund is large, is significant 
at the 1 percent level, and has its theoretically predicted 
sign. In particular, increases in the market value of the backing for 
notes result in an appreciation in the value of Maryland currency. 

Finally, as noted above, Maryland's track record for redeeming 
notes as promised may significantly affect the value of its currency. 
The result of running (6) is 

$$par_t = \underset{(2.92)}{.90} - \underset{(.26)}{(9 \times 10^{-7})}\,M_t - \underset{(1.30)}{.37}\,MVI_t - \underset{(3.49)}{.43}\,D_t, \tag{10}$$

$R^2 = .671$, D-W = 2.19, $Q(8) = 4.59$. 


12 The relevant data were kindly provided to me by Jacob Price. 
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Several things should be noted about this equation. First, it is surpris¬ 
ingly successful. For instance, Hodrick (1978) estimates an exchange 
rate equation between Britain and the United States over the period 
1972-75 that contains no lagged terms. His equation has five explanatory 
variables involving relative money supplies, output levels, and 
interest rates. In addition, like equation (10), it is estimated on the 
basis of relatively few (36) observations. Hodrick reports an $R^2$ of .73. 
He also estimates a similar regression for the U.S.-German exchange 
rate with six explanatory variables and 28 observations. An $R^2$ of .66 is 
reported for this regression. Equation (10) has similar explanatory 
power, without the benefit of contemporaneous income or interest 
rate data. Hence it would appear that the Sargent-Wallace hypothesis 
applied to Maryland has good explanatory power. 

Second, the marginal significance level of the Q-statistic is .80. This, 
along with the Durbin-Watson statistic, gives no suggestion of serially 
correlated residuals. 

Third, as predicted by the Sargent-Wallace hypothesis, $\alpha_2 < 0$ and 
$\alpha_3 < 0$. Thus a history of honoring promised redemptions and a large 
accumulated backing for notes both enhance their value. Moreover, 
the marginal significance level of $\alpha_3$ is $4 \times 10^{-3}$, so the history of the 
colony in honoring promised redemption dates is highly significant. 
The marginal significance level of the market value coefficient is only 
.22. While this is not particularly high, the value of the sinking fund is 
far more significant than the quantity of money. In addition, the 
coefficient on the market value term is fairly large, albeit not very 
precisely estimated. Hence it is not clear that one should conclude 
that it is insignificant. 

Finally, the coefficient on money has a marginal significance level of 
.80 and has the wrong sign (according to the quantity theory). Hence 
it is clearly the case that the nature of promises backing notes, and not 
the quantity of notes, determines their value. 

Again, in defense of the quantity theory, one might ask whether 
some important component of the money supply is omitted in equations 
(8)-(10). The answer is no. The exchange rate used in these 
equations is the rate between Maryland paper currency and sterling. 
Specie in Maryland as well as tobacco money circulated at market- 
determined rates with paper currency. Hence it would be inappropri¬ 
ate to aggregate these with paper currency. 

VI. Conclusions 

The current study encompasses a 70-year period and three colonies 
with somewhat but not completely similar monetary arrangements. In 
all of this experience, only that of South Carolina before 1731 is 
supportive of quantity-theoretic propositions. In this instance, it is 
also true that the quantity theory is a special case of the Sargent- 
Wallace approach. However, in contrast to the performance of the 
quantity theory, the Sargent-Wallace approach generally accounts 
well for the successes of the Carolina currency reforms and for ex¬ 
change rate behavior in Maryland. 

Moreover, I have argued that the success of the one approach and 
the failure of the other cannot be accounted for by omissions in mon¬ 
etary figures. Nor can they be accounted for by the effects of antici¬ 
pated future changes in money stocks that tended to accompany cur¬ 
rent changes. 

How can one attempt to rescue standard approaches to monetary 
theory, then, in which money is treated asymmetrically from other 
assets? One suggestion is that standard money demand functions may 
have characterized colonial currency-holding behavior but that these 
demand functions shifted at convenient points in time. In addition to 
having no empirical content, this view is one highly detrimental to the 
quantity theory. For instance, Friedman and Schwartz (1963) attempt 
to explain U.S. monetary history in the century following the Civil 
War on the basis of a stable demand function for money. Then a 
demonstration that money demand functions were highly unstable 
for a 70-year period in the colonies would be greatly at variance with 
their approach. 

A second suggestion is that the economy of colonial North America 
was sufficiently primitive as not to be a “monetized economy.” Cer¬ 
tainly no one would apply this claim to Europe of the same period. 
Yet existing indications are that money-income ratios were higher in 
the (British) North American colonies than in any European country 
other than Britain itself. Thus such a suggestion would appear to be 
without basis in fact. 

We are left, then, with the conclusion that there is a long period of 
history and a number of locations to which the quantity theory of 
money does not apply. 13 Other views of money that do not treat 
money differently from other assets do appear successful in ex¬ 
plaining this period. Thus it would appear that these views deserve 
greater claim on the attention of monetary economists than they ap¬ 
pear to have received. 
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General Equilibrium Tax Incidence 
under Imperfect Competition: 

A Quantity-setting Supergame Analysis 


Carl Davidson and Lawrence W. Martin 
The incidence of various taxes is analyzed in a model with competitive 
and oligopolistic sectors. Friedman's “grim” trigger strategies 
support collusion with firms producing output to maximize joint 
profits subject to the constraint that no firm desires to cheat. The 
sustainable level of output depends on the capitalized value of future 
retaliation; therefore, tax-induced changes in the net return to capital 
(the discount rate) affect the output mix through their impact on 
the sustainability of collusion. This “collusive pricing effect” is 
isolated and analyzed. One result is that a general factor tax on 
capital (but not labor) is shifted. 


I. Introduction 

In the appendix to his seminal article on tax incidence, Harberger 
(1962) argued that the mechanism that determines the incidence of 
the corporate tax in a model with a monopolistic corporate sector 
“differs only in minute detail” from the mechanism at work in the 
competitive case. Several authors have observed that Harberger's 
treatment of imperfect competition was less than adequate, and thus 
there have been attempts to extend the basic two-sector model to 
allow for imperfect competition. Most notable are the studies by An¬ 
derson and Ballentine (1976) and Atkinson and Stiglitz (1980), in 
which one sector is assumed to be monopolistically competitive. In 
each of these papers the authors assume that the firms act in a Cour- 
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not-Nash manner. Both papers provide significant insight into the 
effects of taxation in the presence of substantial levels of imperfect 
competition; however, the Cournot-Nash assumption directs atten¬ 
tion away from the analysis of the consequences of group interdepen¬ 
dence and collusion, factors that are of vital importance in oligopo¬ 
listic industries when the level of concentration is high. In this paper 
we provide an alternative model that focuses on these factors and how 
they alter the standard results obtained in the tax incidence literature. 

We model a two-sector economy with perfectly competitive and 
oligopolistic sectors. Rather than rule out collusion in the oligopolistic 
sector, we assume that tacit collusion is the norm. In modeling this 
sector we make use of the fundamental insight of the supergame 
literature (see Aumann 1959; Friedman 1971, 1977; Rubinstein 
1979, 1980; Green 1980; Radner 1980; Abreu 1983; Porter 1983; 
Brock and Scheinkman, in press): if a market situation is repeated 
infinitely, the industry may settle at a collusive price even if the firms 
are not explicitly colluding. This implicit agreement is enforceable if 
firms deviating from the agreement suffer future losses (due to retali¬ 
ation) that outweigh any immediate gains from cheating. Such models 
of oligopoly are especially interesting in a general equilibrium setting 
because of the central role played by the net return to capital, which is 
the discount rate for firms that maximize the present value of profits. 

To see this role, observe that it is in the collective interest of firms in 
the oligopolistic sector to maintain an aggregate output as close as 
possible to that of a monopoly. In choosing industry output, however, 
they must beware of the individual firm's incentive to violate the 
implicit agreement, an incentive that varies inversely with industry 
output. Thus the chosen output must be sufficiently large that the 
individual gains from cheating on the agreement do not dominate 
the capitalized value of the losses due to retaliation. It follows that the 
sustainable industry output depends critically on the discount rate, 
for higher rates diminish the present force of future retaliation. In 
order to keep firms from cheating when the discount rate rises, the 
output level must be increased. Taxes that alter the net return to 
capital in general equilibrium will consequently affect the sustainable 
level of industry output and hence its market price. 

The paper is divided into three additional sections. In Section II we 
outline a partial equilibrium model of oligopoly that captures the 
spirit of the supergame approach yet yields conditions appropriate 
for the kind of comparative statics used in general equilibrium tax 
incidence. In the next section we imbed our model of oligopoly into 
the standard two-sector general equilibrium economy with taxes and 
formally isolate the collusive pricing effect. A surprising result is that 
a general factor tax on capital is shifted. The reason lies in the dual 
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role played by the return to capital: the gross return allocates the 
fixed stock of capital, and the net return determines the sustainable 
level of collusive output. In the standard model the net return to 
capital falls to offset the tax, leaving unchanged the gross-of-tax factor 
price and therefore the equilibrium values of all other prices. In our 
model, on the other hand, this fall in the net return to capital increases 
the present value of retaliation by the cartel and thus allows it to 
sustain a lower level of industry output. This change in the economy’s 
mix of output then induces changes in factor prices. 

We also consider excise and partial factor taxes. In addition to the 
usual output and factor substitution effects, our model implies the 
existence of the collusive pricing effect. Section IV contains a few 
concluding remarks. 

II. Partial Equilibrium in the Oligopolistic Sector 

We model the oligopolistic sector as a repeated game (supergame) 
among N identical firms, each producing a single homogeneous good 
(X) under constant costs. Firms collude (tacitly or explicitly) to restrict 
output and enforce the chosen output levels through threat of retaliation 
against “cheaters” (firms that exceed their production quotas). 
Specifically, we consider agreements enforced by the “grim” trigger 
strategies; that is, all firms produce their share of the cartel quantity 
unless some firm cheats. If any cheating occurs at time t, the cartel 
dissolves, and each firm reverts permanently to the output level it 
produces in the static Nash equilibrium. We assume that any cheating 
is detected costlessly. 

The potential cheater compares the current higher profits due to 
cheating and the future lower profits brought about by the dissolution 
of the cartel. More formally, let $\pi^{ch}$ denote the profit earned when 
cheating; $\pi^c$ the profit per firm if all abide by the production plan of 
the cartel; and $\pi^n$ the profit earned by each firm in the static Nash 
equilibrium. The net gains from cheating ($Z$) are then 

$$Z = (\pi^{ch} - \pi^c) - \frac{1}{r}(\pi^c - \pi^n). \tag{1}$$

The first term in parentheses is the extra profit earned at time $t$ by 
exceeding the production quota; the second is the present value of all 
future lower profits due to retaliation. The discount rate is $r$. Cheating 
occurs when $Z > 0$. 

In order to examine $Z$ in more detail let $P_X(Q, \omega)$ denote the twice 
differentiable inverse demand for good $X$, with $Q$ denoting industry 
output and $\omega$ a vector of shift parameters. If we let $c$ denote the 
constant unit cost and $Q_i$ the output of an individual firm, then profit 
per firm is 

$$\pi_i = [P_X(Q, \omega) - c]Q_i.$$
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It is well known that existence problems trouble models of imperfect 
competition unless further restrictions are placed on demand condi¬ 
tions (see Roberts and Sonnenschein 1977). Hence we assume that 
industry profits defined by $P(\cdot)$ and $c$ are single peaked. This assumption 
ensures that $\pi^n$ is well defined. Note that once $P(\cdot)$ and $c$ are 
given, $\pi^n$ is determined and is independent of the cartel’s actions. 
Cheating profits are obtained by maximizing $\pi_i$ over $Q_i$, assuming all 
other firms are producing the cartel output level $Q_c$. Thus cheating 
output, cheating profits, and $Z$ may be written as a function of $Q_c$. 

Turning to the cartel’s problem, we assume that it chooses the 
aggregate production level and its distribution among firms to secure 
as large a joint profit as is consistent with the absence of cheating. 
Because all firms are identical, however, the distribution of output 
quotas will be symmetrical. Thus the cartel problem reduces to that of 
choosing output per firm, $Q_c$, from the set of sustainable outputs 
defined as $\Omega \equiv \{Q_c : Z \le 0,\ Q_c \ge 0\}$. In this setting $\Omega$ may be interpreted 
as a constraint on the ability of the cartel to restrict output. 
Formally, the cartel solves 

$$\max_{Q_c}\ \pi^c \quad \text{subject to } Q_c \in \Omega. \tag{2}$$

Let $Q^*$ denote the solution to (2). Because the number of firms is 
fixed, our assumption that profit functions are single peaked implies 
that $\pi^c$ has a unique maximum, say $\pi^m$ for $Q_c = Q_m$. If $Z(Q_m) \le 0$, then 
$\pi^* = \pi^m$; that is, joint profit maximization is sustainable. This case is 
equivalent to having a monopolist in this sector, and since tax incidence 
in the presence of a monopolistic sector has been studied by 
Anderson and Ballentine (1976) and Atkinson and Stiglitz (1980), we 
will assume $Z(Q_m) > 0$. In this case, $Q^*$ is the smallest level of output 
greater than $Q_m$ that satisfies $Z(Q_c) = 0$. Collusion exists, but it is 
constrained collusion. 
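The constrained cartel problem can be made concrete with a worked example. The sketch below assumes a linear inverse demand P = a - bQ, which is an illustrative assumption and not a functional form used in the paper; the function names, the grid search, and the parameter values in the usage note are likewise hypothetical. It computes the net gain from cheating and finds the smallest per-firm output at or above the joint-profit-maximizing level at which cheating no longer pays.

```python
def profits(q_cartel, a, b, c, n_firms):
    """Per-firm (cheating, cartel, Nash) profits under P = a - b*Q."""
    margin = a - c
    cartel = (margin - b * n_firms * q_cartel) * q_cartel
    # A cheater best-responds while the other n_firms - 1 firms keep their quotas.
    residual = margin - b * (n_firms - 1) * q_cartel
    cheating = residual ** 2 / (4.0 * b)
    # Static Cournot-Nash profit per firm.
    nash = margin ** 2 / (b * (n_firms + 1) ** 2)
    return cheating, cartel, nash


def net_gain_from_cheating(q_cartel, a, b, c, n_firms, r):
    """One-period gain from cheating minus the capitalized loss from retaliation."""
    cheating, cartel, nash = profits(q_cartel, a, b, c, n_firms)
    return (cheating - cartel) - (cartel - nash) / r


def constrained_cartel_output(a, b, c, n_firms, r, grid=10_000, tol=1e-12):
    """Smallest sustainable per-firm output at or above the monopoly share."""
    q_m = (a - c) / (2.0 * b * n_firms)        # joint-profit-maximizing share
    q_n = (a - c) / (b * (n_firms + 1))        # static Nash output per firm

    def z(q):
        return net_gain_from_cheating(q, a, b, c, n_firms, r)

    if z(q_m) <= 0:
        return q_m                             # full collusion is sustainable
    step = (q_n - q_m) / grid
    lo = hi = q_m
    for k in range(1, grid + 1):               # scan upward for the first sign change
        hi = q_m + k * step
        if z(hi) <= 0:
            break
        lo = hi
    while hi - lo > tol:                       # refine the root by bisection
        mid = 0.5 * (lo + hi)
        if z(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi
```

With a = 10, b = 1, c = 2, four firms, and a per-period discount rate of 1.0 (chosen only so that the constraint binds), the monopoly share is 1.0 per firm, the Nash output is 1.6, and the constrained cartel output is roughly 1.13. Raising the discount rate to 2.0 raises the sustainable output, while a rate of 0.1 makes the monopoly share itself sustainable.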

In general, writing $Z$ as a function of $Q_c$ and the other parameters 
of the model, we obtain the solution to (2) by solving equation (3) for 
$Q_c$:2 

$$Z = Z(Q_c, \omega, c, r) = 0. \tag{3}$$

Variable $Q_c$ may then be substituted into the inverse demand function 


1 See also Brock and Scheinkman (in press), who adopt this approach for price 

2 [...] Martin (1984), it is shown that a solution to [...] exists and 
that the implicit function theorem holds for (3). 
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to obtain the cartel’s price as a function of the basic parameters of the 
model: 


$$P_X = P_X(\omega, c, r). \tag{4}$$

In the next section we imbed (4) in a general equilibrium model. 

Of particular interest in (4) is the role of r, the discount rate. For a 
cost-minimizing firm, the discount rate will be equated to the net 
return to capital. In the absence of taxes r also represents the cost of 
capital and enters (4) through c. But r also enters separately as the 
third argument. This separate effect indicates the impact of the price 
of capital on the pricing decision of the cartel. A rise in r diminishes 
the present force of future retaliation, but it also increases the cost of 
capital. If the first effect dominates, then such a change will require 
an increase in output (fall in price) in order to keep firms from cheat¬ 
ing on the agreement. Unlike the competitive model (or other models 
of imperfect competition) in which an increase in the cost of capital 
can result only in higher prices, in this supergame there is a possibility 
that the prices of output and capital may move in opposite directions. 
Formally, differentiating (4), we get 

$$\frac{dP_X}{dr} = \frac{\partial P_X}{\partial c}\left(\frac{dc}{dr}\right) + \frac{\partial P_X}{\partial r}. \tag{5}$$

The positive first term in (5) captures the “cost-push” effect of the 
change in $r$; the negative second term, the impact of the greater 
inducement to cheat on the original quantity assignments. We term 
this latter effect the “collusive pricing” effect, and in the next section 
we investigate how this additional force alters the standard results on 
tax incidence. 

III. The Oligopolistic Sector 
in General Equilibrium 

In this section we imbed the model of oligopoly into the standard two- 
sector model of general equilibrium. The basic model is well known, 
and we follow the presentation and notation of Atkinson and Stiglitz 
(1980). There are two goods, X and Y, and each is produced under 
constant returns. Perfect competition prevails in the Y sector, but the 
X sector is oligopolistic. Both sectors employ capital and labor, which 
are fixed in supply and fully mobile among firms and between sectors. 
Capital is infinitely lived and nondepreciable and is traded in a flour¬ 
ishing rental market. All firms, whether competitive or members of a 
cartel, are price takers in the capital market. This implies that the 
opportunity cost of using capital is the gross-of-tax equilibrium rental 
price, for both firms that own capital and those that rent. In the 
context of this model capital is putty-putty. There are no sunk costs. 

We let $q_i$ = the gross-of-tax output price of good $i$, $c_i$ = the unit cost
function for good $i$, $w$ and $r$ = the net returns to labor and capital,
respectively, $T_i$ = one plus the ad valorem output tax on good $i$, $T_{ij}$ =
one plus the partial factor tax on input $i$ used in the production of
good $j$, and $M$ = aggregate income.

The innovation in our model is the pricing in the oligopolistic sector,
which is governed by an equation analogous to (4). In the model
the vector of shift parameters includes the price of good Y and income;
that is, $\omega = (q_y, M)$. Then, assuming $Z(Q_m) > 0$, the cartel’s
gross price is given by

$$q_x = q_x(q_y, M, c_x T_x, r). \tag{6}$$

Assuming perfect competition in the Y sector, the price of Y must
be equal to the marginal cost of production:

$$q_y = c_y(wT_{Ly}, rT_{Ky})\,T_y. \tag{7}$$

Aggregate demands for the two products are

$$X = X(q_x, q_y, M), \tag{8}$$

$$Y = Y(q_x, q_y, M), \tag{9}$$

and since in equilibrium all income is spent,

$$M = q_x X + q_y Y. \tag{10}$$

Assuming fixed supplies of the two inputs, labor ($L$) and capital ($K$),
the full-employment equations are

$$c_{Lx} X + c_{Ly} Y = L_0, \tag{11}$$

$$c_{Kx} X + c_{Ky} Y = K_0. \tag{12}$$

Here the $c_{ij}$ are the partial derivatives of the $j$th unit cost function with
respect to the $i$th factor and represent the $i$th input requirement per
unit of output of the $j$th good; $L_0$ and $K_0$ represent the fixed stock of
labor and capital, respectively.

Equations (6)–(12) comprise the general equilibrium model.
Choosing $w$ as the numeraire and dropping equation (9) from the
model leaves six equations in six unknowns. The standard approach is
to differentiate this system of equations totally and solve the resulting
“equations of change.” Unfortunately, differentiation of (6) yields, in
general, quite an unwieldy expression involving relative shifts in the
demand for X at the cartel, Nash, and cheating points. Thus for the
remainder of this paper we focus on a specific case: we assume that
the representative consumer has a utility function of the form

$$U(X, Y) = (1 - \alpha)\ln Y + \alpha \ln X.$$

In this case the inverse demand curve for X is

$$q_x = \frac{\alpha M}{X}, \tag{13}$$

so that $\alpha$ is the budget share of good X. Equation (6) can now be
simplified since (13) does not directly involve the price of good Y.

Carrying out the cartel’s maximization problem (see eq. [2]) yields
(assuming $r > 1/N$ so that the monopoly output is not sustainable)

$$X = \frac{\alpha M (N - 1)(rN - 1)^2}{c_x T_x N (rN + 1)^2}. \tag{14}$$

Substituting into (13) we obtain the counterpart of (4) and (6):³

$$q_x = \frac{N (rN + 1)^2}{(N - 1)(rN - 1)^2}\, c_x T_x. \tag{15}$$

Finally, differentiating (7) and (15) and subtracting we have

$$\hat{q}_x - \hat{q}_y = -(\theta^* + \Psi)\hat{r} + (\hat{T}_x - \hat{T}_y) + (\theta_{Kx}\hat{T}_{Kx} - \theta_{Ky}\hat{T}_{Ky}), \tag{16}$$

where, as usual, the circumflex (^) indicates proportional change (e.g.,
$\hat{X} = dX/X$), $\theta^* = \theta_{Lx} - \theta_{Ly}$, $\theta_{Lj} \equiv w c_{Lj} T_{Lj}/c_j$ (the share of costs going to
labor in industry $j$), $\theta_{Kj} \equiv r c_{Kj} T_{Kj}/c_j$ (capital’s share in industry $j$), and $\theta^*$
measures the value of factor intensity. The second term in equation
(16), $\Psi = 4rN/[(rN)^2 - 1]$, captures the effect of changes in the discount
rate on the ability of the cartel to enforce its output restriction.
Note that $\Psi > 0$ for all $r > 1/N$, which is the present case.
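
As a rough numerical check on equation (15) and on the expression for $\Psi$, the following Python sketch evaluates the sustainable cartel price and compares the analytic elasticity with a finite-difference estimate. The function names and the parameter values ($r = 0.5$, $N = 4$) are illustrative assumptions and do not come from the paper.

```python
import math


def cartel_price(r, n_firms, unit_cost=1.0, tax=1.0):
    """Sustainable cartel price from eq. (15); only defined for r > 1/N."""
    rn = r * n_firms
    if rn <= 1.0:
        raise ValueError("eq. (15) assumes r > 1/N")
    markup = n_firms * (rn + 1.0) ** 2 / ((n_firms - 1.0) * (rn - 1.0) ** 2)
    return markup * unit_cost * tax


def psi(r, n_firms):
    """Collusive pricing term Psi = 4rN / ((rN)^2 - 1)."""
    rn = r * n_firms
    return 4.0 * rn / (rn ** 2 - 1.0)


def numerical_elasticity(r, n_firms, step=1e-6):
    """Central-difference estimate of -d ln(q_x) / d ln(r), holding c_x T_x fixed."""
    up = math.log(cartel_price(r * (1.0 + step), n_firms))
    down = math.log(cartel_price(r * (1.0 - step), n_firms))
    return -(up - down) / (math.log(1.0 + step) - math.log(1.0 - step))


if __name__ == "__main__":
    # Illustrative values only: a discount rate of 0.5 and four firms.
    print(psi(0.5, 4), numerical_elasticity(0.5, 4))
    # As r grows large the price approaches N/(N - 1) times unit cost.
    print(cartel_price(1e9, 4))
```

Holding $c_x T_x$ fixed, the finite-difference elasticity should agree with $\Psi$, and for very large $r$ the price should approach $N/(N - 1)$ times unit cost, the limit of (15).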

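Differentiation of the demand and full-employment conditions
yields

$$\hat{X} - \hat{Y} = -(\hat{q}_x - \hat{q}_y), \tag{17}$$

$$\lambda^*(\hat{X} - \hat{Y}) = -(a_x\sigma_x + a_y\sigma_y)\hat{r} - a_x\sigma_x\hat{T}_{Kx} - a_y\sigma_y\hat{T}_{Ky}, \tag{18}$$

where $\sigma_j > 0$ is the elasticity of substitution in the production of the
$j$th good, $\lambda^* = \lambda_{Lx} - \lambda_{Kx}$, $\lambda_{ij}$ is the share of factor $i$ employed in
industry $j$ (e.g., $\lambda_{Lx} = c_{Lx}X/L_0$), and $a_j = \theta_{Lj}\lambda_{Kj} + \theta_{Kj}\lambda_{Lj} > 0$.
Obviously, $\lambda_{Lx} + \lambda_{Ly} = \lambda_{Kx} + \lambda_{Ky} = \theta_{Lx} + \theta_{Kx} = \theta_{Ly} + \theta_{Ky} = 1$
and $\lambda^*\theta^* > 0$.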


3 It may seem that there is some legerdemain here in that $M$, $c$, and $r$ are not fixed
in general equilibrium. Should the cartel not anticipate the general equilibrium effects
of its decisions and not act in accord with (4)? At a purely formal level this is true;
however, following Harberger (1962, p. 239), we do not view the imperfectly competitive
sector as a gigantic cartel but rather as comprising many cartels, each “small”
relative to the economy but including firms that are large relative to the industry.

Equations (16)–(18) constitute a three-equation general equilibrium
model with three unknowns: $\hat{X} - \hat{Y}$, $\hat{q}_x - \hat{q}_y$, and $\hat{r}$. We now
consider the incidence of three types of taxes within this model: general
factor taxes, output taxes, and partial factor taxes. In order to
keep the discussion as simple as possible while focusing on what distinguishes
our model from the standard, we will analyze the introduction
of small taxes into a model with no existing taxes.⁴ The only
existing distortion is that implied by cartel pricing. Although this
implies that income is not constant (price exceeds cost in the X sector),
the homotheticity of our product demands obviates any consideration
of income effects (see Atkinson and Stiglitz 1980, pp. 182–83). Any
tax revenue is refunded lump sum to consumers or spent according
to equations (8) and (9).

A. General Factor Taxes 

A general factor tax in the standard two-sector model is a tax on a
factor in fixed supply and is borne entirely by the factor owners. That
is, net returns fall to offset the tax exactly. Because the gross-of-tax
factor price remains unchanged, the remainder of the economy is
insulated from any tax-induced distortion.⁵

In our model, while this pattern remains for a general tax on labor,⁶
a portion of the general factor tax on capital may be shifted or capital
owners may see their net returns fall by more than the tax payments.
Furthermore, the mix of outputs, relative product prices, and returns
to the untaxed labor will change with the tax on capital. The reason is
that the net-of-tax price of capital (r) serves two functions: it allocates 
the fixed supply of capital between industries, and further it measures 
time preference. In this latter role, the level of r determines the pres¬ 
ent value of the punishment for cheating on the X sector cartel. A fall 
in r reduces the inducement to cheat, thereby allowing the cartel to 
sustain a higher price for its output. 


4 Musgrave (1959, pp. 211–13) refers to this concept as differential tax incidence
with equal money yield. We compare the incidence of various taxes with lump-sum
taxes of equal monetary yield. The government will not generally be able to purchase
the same bundle of real commodities. See Shoven and Whalley (1977) on various
conceptions of equal yield taxation.

5 Feldstein (1977) shows that this need not hold in an intertemporal model. His fixed
factor is land, which is a perfect substitute for capital as a store of wealth. The tax
reduces the value of the stock of land, leading to an offsetting increase in capital
holdings. In our model no such adjustment in the capital stock is possible, yet the tax is
shifted.

6 The identity $wT_L = 1$ implies $\hat{w} = -\hat{T}_L$ and that a general tax on labor is borne
entirely by workers. From (16)–(18) no other equilibrium values are affected.

In equations (16)–(18) set $\hat{T}_x = \hat{T}_y = 0$ and $\hat{T}_{Kx} = \hat{T}_{Ky} = \hat{T}_K$ and
solve for the effects of changes in $T_K$ on the three endogenous variables:

$$-\hat{r}D^* = (\theta^*\lambda^* + a_x\sigma_x + a_y\sigma_y)\hat{T}_K, \tag{19a}$$

$$(\hat{X} - \hat{Y})D^* = -(a_x\sigma_x + a_y\sigma_y)\Psi\hat{T}_K, \tag{19b}$$

$$(\hat{q}_x - \hat{q}_y)D^* = (a_x\sigma_x + a_y\sigma_y)\Psi\hat{T}_K; \tag{19c}$$

$D^* = a_x\sigma_x + a_y\sigma_y + \lambda^*(\theta^* + \Psi)$ must be positive for stability.⁷

Equation (19a) confirms that $w/r$ increases with an increase in $T_K$,
but the absolute value of the elasticity is less or greater than one as $\lambda^*$
is positive or negative. If $\lambda^* > 0$, then the X industry is labor intensive.
The tax-induced fall in $r$ reduces cartel output, which concomitantly
increases the relative demand for capital, mitigating the fall in its
relative price. Some of the tax is then shifted toward labor. On the
other hand, if the cartel sector is capital intensive ($\lambda^* < 0$), then the
restriction in its output further increases $w/r$. Capital bears more than
100 percent of the tax.⁸
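
The comparative statics in (19a)–(19c) can be checked numerically by solving the three linear equations (16)–(18) directly. The Python sketch below does this with a small Gauss-Jordan solver; the function names and the parameter values in the usage lines are hypothetical and chosen only for illustration, and $a_x\sigma_x + a_y\sigma_y$ is passed as a single number `a_sigma`.

```python
def solve_linear(a, b):
    """Solve the dense system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def incidence(theta_star, lam_star, psi, a_sigma, t_k=0.0, t_x=0.0):
    """Return (X_hat - Y_hat, qx_hat - qy_hat, r_hat) implied by eqs. (16)-(18).

    t_k is a general tax on capital (T_Kx_hat = T_Ky_hat = t_k) and t_x an
    output tax on X. With a general capital tax the last term of (16) equals
    (theta_Kx - theta_Ky) * t_k = -theta_star * t_k.
    """
    # Unknowns are ordered [xy, qq, r].
    a = [
        [0.0, 1.0, theta_star + psi],  # (16): qq + (theta* + Psi) r = t_x - theta* t_k
        [1.0, 1.0, 0.0],               # (17): xy + qq = 0
        [lam_star, 0.0, a_sigma],      # (18): lam* xy + a_sigma r = -a_sigma t_k
    ]
    b = [t_x - theta_star * t_k, 0.0, -a_sigma * t_k]
    return tuple(solve_linear(a, b))


if __name__ == "__main__":
    # Hypothetical parameter values: a labor-intensive cartel sector (lam* > 0).
    xy, qq, r_hat = incidence(0.2, 0.3, 0.5, 0.8, t_k=0.01)
    print(xy, qq, r_hat)  # |r_hat| should be smaller than the 0.01 tax change
```

With $\lambda^* > 0$ the solved $\hat{r}$ is smaller in absolute value than the tax change, which is the partial shifting toward labor described above; setting `psi=0.0` reproduces the competitive benchmark in which $\hat{r} = -\hat{T}_K$.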

B. Selective Output Taxes 

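To examine the impact of a selective output tax on the X sector set
$\hat{T}_{Kx} = \hat{T}_{Ky} = \hat{T}_y = 0$ in (16)–(18) and solve for $(\hat{X} - \hat{Y})$, $(\hat{q}_x - \hat{q}_y)$, and $\hat{r}$. We
obtain

$$\hat{r}D^* = \lambda^*\hat{T}_x, \tag{20a}$$

$$(\hat{X} - \hat{Y})D^* = -(a_x\sigma_x + a_y\sigma_y)\hat{T}_x, \tag{20b}$$

$$(\hat{q}_x - \hat{q}_y)D^* = (a_x\sigma_x + a_y\sigma_y)\hat{T}_x. \tag{20c}$$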

Relative factor returns move inversely to the physical factor intensity 
of the taxed sector. The taxed sector contracts, and consumer prices 
for good X rise relatively (note that the price the firms receive may 
rise or fall depending on the strength of the collusive pricing effect). 
These results are consistent with those of the standard competitive 
model in sign, but the magnitude of the effects may be significantly
different if the collusive pricing effect is large. The collusive pricing
term, $\Psi$, affects $D^*$ in the same direction as the sign of the measure of
physical factor intensity, $\lambda^*$. If the taxed sector is labor intensive, the
collusive pricing effect leads to a larger D* and consequently smaller 
elasticities. Relative factor returns, outputs, and output prices are less 


7 For a detailed discussion of the stability properties of this model see Davidson and 
Martin (1984). 

8 For comparison with the standard two-sector tax model, in the competitive case $\Psi
= 0$ and therefore $D^* = \lambda^*\theta^* + a_x\sigma_x + a_y\sigma_y$. From (19a) the elasticity of $r$ with respect
to the tax equals −1 and the right-hand sides of (19b) and (19c) vanish.

sector as a supergame and assume that its output is the solution to a
constrained maximization problem, where the constraint is the incentive
of an individual firm to “cheat” on the oligopoly production plan.
This approach allows comparative statics to be carried out in the
traditional way. Further, we imbed our partial equilibrium oligopoly
model into the standard two-sector model of general equilibrium. To
our knowledge all of the supergame literature is partial equilibrium,
and ours is the first general equilibrium treatment.

In our analysis of tax incidence we isolate a hitherto unnoticed 
collusive pricing effect that captures the impact of changes in the net 
return to capital on the output of the oligopoly. Because current 
output is enforced through threat of future retaliation, changes in the 
net return to capital, which is the discount rate of the firm, will affect 
the ability of the oligopoly to restrict output below Nash equilibrium 
levels. 

One particular result stands out. A general factor tax on capital is
shifted even though the factor is in fixed supply. In our model the
gross-of-tax price of capital affects the cost of production, but the
net-of-tax price affects the sustainable level of industry output. The
introduction of a tax on capital must change either the net or gross
price (or both), and therefore the impact of the tax “spills over” out of
the capital market and affects the equilibrium values of all prices.

Finally, while we have considered a specific model of oligopoly, our
results do not depend on the choice of this particular punishment
scheme. For example, Abreu (1983) suggests a two-phase punishment
scheme that, when the monopoly output is not sustainable, credibly
supports more collusive output than the grim trigger strategies. In
addition, in Abreu’s model the punishment lasts only one period. Our
results generalize to Abreu’s model.⁹ The reason for this is that any
punishment aimed at a potential cheater occurs in the future regardless
of the number of periods it takes. The punishment is thus discounted,
whereas the gains from cheating are immediate. An increase
in the rate of interest always makes the capitalized value of the losses
due to retaliation seem smaller and makes collusion more difficult to
sustain. The collusive output must then expand to reduce the temptation
to cheat. This interest rate–induced change in output is our
collusive pricing effect.

Uncovering Financial Market Expectations
of Inflation


James D. Hamilton

University of Virginia


This paper seeks to uncover market expectations of inflation from
the joint dynamics of nominal interest rates and observed inflation. I
postulate (1) the existence of a stable vector autoregression in unobserved
variables and (2) efficient financial markets. The estimation
procedure requires no further structural assumptions; the econometrician
need not observe all the information used by agents in
forming forecasts or all the variables that influence real interest
rates. I find no empirical evidence that economic recessions are
correlated with inflation forecast errors. However, postwar recessions
are associated with ex ante real interest rates that are twice the postwar
average.


I. Introduction 

Changes in expected inflation undeniably influence nominal interest
rates. This observation has motivated a number of innovative efforts
to uncover financial market expectations of inflation from movements
in nominal interest rates, such as Dwyer (1981), Gessler (1981), Mishkin
(1981), Fama and Gibbons (1982), and Frankel (1982). This study
shares with these earlier researchers a desire to uncover information
about inflationary expectations while imposing as few structural
assumptions as possible.

This paper assumes that (1) a trivariate low-order vector autoregression
exists summarizing the dynamics of ex ante real interest
rates, expectations of inflation, and realizations of inflation and (2)
agents’ forecasts of inflation are rational. Agents are assumed to possess
more information about the determinants of inflation than the
econometrician. These assumptions are shown to be sufficient to generate
historical estimates of agents’ expectations of inflation based
solely on observation of nominal interest rates and realized inflation
rates.

Section II provides a formal statement of assumptions and
methods, with additional details in the Appendix. Section III summarizes
empirical estimates, while Section IV examines the implications
of the estimated historical series on expected inflation for economic
theories about financial markets, business cycles, and monetary
policy. Section V briefly concludes.

II. Assumptions and Methods 

Let us define $\pi_t$ = actual percentage change in price level between
dates $t$ and $t + 1$, $\pi^e_t$ = market expectation of $\pi_t$ based on information
available at $t$, $e_t$ = market forecast error ($= \pi_t - \pi^e_t$), $i_t$ = nominal
return on one-period bond purchased at date $t$ and redeemed at $t +
1$, and $r_t$ = ex ante real interest rate ($= i_t - \pi^e_t$). I assume that the
following representations exist:

$$r_t = k_1 + \phi(L)r_t + \psi_0\pi^e_t + \psi(L)\pi^e_t + \xi(L)\pi_t + \epsilon_{1t}, \tag{1}$$

$$\pi^e_t = k_2 + \alpha(L)r_t + \beta(L)\pi^e_t + \gamma(L)\pi_t + \epsilon_{2t}, \tag{2}$$

with

$$E(\epsilon_{1t}) = E(\epsilon_{1t}r_{t-j}) = E(\epsilon_{1t}\pi^e_{t-j}) = E(\epsilon_{1t}\pi_{t-j}) = 0 \quad \text{for } j \ge 1, \tag{1'}$$

$$E(\epsilon_{2t}) = E(\epsilon_{2t}r_{t-j}) = E(\epsilon_{2t}\pi^e_{t-j}) = E(\epsilon_{2t}\pi_{t-j}) = 0 \quad \text{for } j \ge 1. \tag{2'}$$

For any symbol $y$, I let $y(L)$ denote $y_1L^1 + y_2L^2 + \cdots + y_pL^p$, with $L^jx_t
= x_{t-j}$. Using the definitions of $e_t$, $\epsilon_{1t}$, and $\epsilon_{2t}$, conditions (1') and (2')
imply

$$E(\epsilon_{it}\epsilon_{k,t-j}) = E(\epsilon_{it}e_{t-j}) = 0 \tag{3}$$

for $i, k = 1, 2$ and for all $j \ge 1$.

In imposing the orthogonality conditions (1') and (2') I am in effect
postulating that the joint dynamics relating ex ante real interest rates
to expectations of inflation are stable and relatively simple. Stability
allows us to define the parameters in equations (1) and (2) as population
regression coefficients, in which case (1') and (2') are guaranteed
to be true by construction for $j = 1, 2, \ldots, p$. Moreover, if these
dynamics are simple enough that they can be summarized by a low-order
vector autoregression, (1') and (2') will also hold for $j = p + 1,
p + 2, \ldots$.

Note that neither equation (1) nor equation (2) is intended to summarize
all the variables that influence real interest rates or expected
inflation. Instead, these equations simply represent the statistical projections
of $r_t$ and $\pi^e_t$ on a strict subset of the variables by which they are
actually determined.

The model used by Fama and Gibbons (1982) is a special case of my
equation (1) in which the researchers imposed the restriction $\phi_1 = 1$
and set all other parameters to zero. Gessler (1981) estimated $\phi_1$ from
the data but again took all other parameters in (1) to be zero. Despite
these restrictions, these researchers still needed to make use of the
full orthogonality conditions as stated in my equation (3) for all $j \ge 1$.¹
The same orthogonality conditions (3) were imposed by Mishkin
(1981), though for his purposes they were less critical to the validity of
his approach.² Thus, while I impose a restriction implicit in this earlier
research, it is perhaps more plausible in the present context in
that we have permitted richer dynamics for real interest rates and
their relation to expected inflation than were allowed by earlier investigators.

In addition to assuming that the joint dynamics of ex ante real
interest rates and expected inflation are stable and relatively simple, I
follow earlier researchers in imposing the structural assumption of
efficient markets. I assume that agents’ information sets available at
time $t$ include at least (but are not limited to) $i_{t-j}$, $\pi^e_{t-j}$, and $\pi_{t-j-1}$ for $j
= 0, 1, 2, \ldots$ (these of course imply inclusion of $r_t, r_{t-1}, \ldots$). Thus

$$E(e_t\pi_{t-j-1}) = E(e_ti_{t-j}) = E(e_t\pi^e_{t-j}) = 0 \quad \text{for } j = 0, 1, 2, \ldots.$$

Again appealing to the definition of $\epsilon_{i,t-j}$, this means that $E(e_t\epsilon_{i,t-j}) = 0$
for $j = 0, 1, 2, \ldots$. Summarizing,

$$E\left[(\epsilon_{1t}, \epsilon_{2t}, e_t)'(\epsilon_{1s}, \epsilon_{2s}, e_s)\right] =
\begin{cases}
\begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_e^2 \end{bmatrix} & \text{for } t = s, \\
0 & \text{for } t \ne s.
\end{cases} \tag{4}$$

Substituting the definitions $r_t = i_t - \pi^e_t$ and $\pi^e_t = \pi_t - e_t$ into equations
(1) and (2) yields

$$i_t - \pi_t = k_1 + \phi(L)(i_t - \pi_t) + \psi_0\pi_t + (\psi + \xi)(L)\pi_t + u_{1t}, \tag{5}$$

$$\pi_t = k_2 + \alpha(L)(i_t - \pi_t) + (\beta + \gamma)(L)\pi_t + u_{2t}, \tag{6}$$


1 The parameter tt, , in Fama and Gibbons (1982) corresponds to my term f i,
whereas their tt, is my r,. In order for the error term tt, , + n, − n, , in Fama and
Gibbons's eq. (li) to follow an MA(1) process as claimed, it is necessary to assume that
fc(e,,e,, ) = ,) = 0 for $j \ge 1$. The same restrictions are implicit in the state-space
motivation behind Gessler’s (1981) approach.

2 The parameter u, used by Mishkin (1981) corresponds to my term whereas his e,
corresponds to my r. Thus his condition (tt) that E(u,u,,,) = 0 and E(u, , ,€,) = 0 is again
equivalent to my condition (3).

where we have defined

$$u_{1t} = -(1 + \psi_0)e_t + (\phi - \psi)(L)e_t + \epsilon_{1t},$$

$$u_{2t} = e_t + (\alpha - \beta)(L)e_t + \epsilon_{2t}.$$

In simpler versions of the model presented above, Fama and Gibbons
and Gessler showed that if ex ante real rates ($r_t$) follow a univariate
AR(1), then ex post real rates ($i_t - \pi_t$) must follow a univariate
ARMA(1, 1),³ which assumption allowed them to uncover the structural
parameters governing the dynamics of ex ante real rates from
the observed behavior of ex post rates. The same principle can be
applied here. Condition (4) implies that ($u_{1t}$, $u_{2t}$) obey a bivariate
MA($p$) process; thus equations (5) and (6) specify that ex post real
rates ($i_t - \pi_t$) and inflation rates ($\pi_t$) follow a bivariate ARMA($p$, $p$)
process. In particular, we might hope to recover the $4p$ parameters ($\phi$,
$\alpha$, $\psi + \xi$, $\beta + \gamma$) from the autoregressive coefficients of this bivariate
process.⁴ Attempting to estimate these structural parameters in this
way has the following economic interpretation. Suppose $p$ equals 4
quarters, and imagine trying to forecast the time path of ex post real
rates for a period beginning 1 year in the future based solely on
realizations of ex post rates and inflation observed as of the present.
The time path of these forecasts will follow some difference equation
as the forecasts extend farther and farther into the future. Our procedure
amounts to estimating the parameters governing the dynamics
of actual ex ante variables ($\phi$, $\alpha$, $\psi + \xi$, $\beta + \gamma$) from the parameters
of this difference equation. The heart of this estimation procedure is
thus the assumption that the observed dynamics of ex post real interest
rates are in this long-run sense the same as those for an unobserved
process for ex ante rates. Obviously, it is crucial that a
simple, low-order autoregression of the form of (1) and (2) accurately
describe the (unobserved) dynamics of $r_t$ and $\pi^e_t$.
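
The univariate special case gives a feel for the principle. The Python sketch below assumes, purely for illustration, that the ex ante rate is a stationary AR(1) and that the ex post rate equals the ex ante rate plus a white-noise forecast error uncorrelated with it at all leads and lags; the function names and parameter values are hypothetical. Under those assumptions the autocovariances of the ex post rate decay at the AR rate from lag 1 onward, so the ratio of the lag-2 to the lag-1 autocovariance recovers the autoregressive coefficient even though the lag-1 autocorrelation is pulled toward zero by the noise.

```python
def ex_post_autocovariances(phi, var_innovation, var_forecast_error, max_lag):
    """Autocovariances of (AR(1) ex ante rate) + (white-noise forecast error)."""
    var_ex_ante = var_innovation / (1.0 - phi ** 2)  # stationary AR(1) variance
    gammas = [var_ex_ante + var_forecast_error]      # lag 0 picks up the noise
    gammas += [phi ** k * var_ex_ante for k in range(1, max_lag + 1)]
    return gammas


def recover_ar_coefficient(gammas):
    """The ratio gamma_2 / gamma_1 is unaffected by the white-noise component."""
    return gammas[2] / gammas[1]


if __name__ == "__main__":
    # Illustrative values only.
    g = ex_post_autocovariances(phi=0.7, var_innovation=1.0,
                                var_forecast_error=2.0, max_lag=3)
    print(recover_ar_coefficient(g))  # ratio of lag-2 to lag-1 autocovariance
    print(g[1] / g[0])                # lag-1 autocorrelation, attenuated by noise
```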

Having uncovered these $4p$ structural parameters from the autoregressive
coefficients of the ARMA($p$, $p$) representation for ($i_t - \pi_t$,
$\pi_t$), what information about parameters remains in the moving average
terms? These error terms represent the combined influence of
the structural innovations $\epsilon_{1t}$, $\epsilon_{2t}$ and the inflation forecast errors $e_t$.
However, from the definitions of $u_{1t}$, $u_{2t}$ it is clear that under our
assumptions all of the dynamics of these moving average terms come

3 This is the content of eq. (7) in Fama and Gibbons (1982) with 3 — 1 and the ex post
rate = TB, 1 - /,.

4 I have deliberately simplified this discussion to make the intuition as clear as possible.
In fact, a reduced-form ARMA process will not uncover the AR parameters in the
form of (5) because $u_{1t}$ is correlated with $\pi_t$. The correct argument (used in the App.)
proceeds by first substituting $\psi_0$ times eq. (6) for $\psi_0\pi_t$ in eq. (5), and the $4p$ parameters
actually uncovered from the vector ARMA($p$, $p$) one would estimate from the data are

($\phi + \psi_0\alpha$, $\alpha$, $\psi + \xi + \psi_0(\beta + \gamma)$, $\beta + \gamma$).

from the inflation forecast error $e_t$. Thus, one might hope to estimate
the $2p + 1$ parameters ($\phi - \psi$, $\alpha - \beta$, $\psi_0$) from the $4p$ unrestricted
lagged moving average coefficients and ($\sigma_1^2$, $\sigma_2^2$, $\sigma_e^2$) from the contemporaneous
variance matrix.

Calculating the precise conditions under which one can do this is a
straightforward exercise, an outline of which is sketched in the Appendix.
The Appendix further details the maximum likelihood algorithm
used to estimate these structural parameters from the data.

Given knowledge of the structural parameters in (1) and (2), it is
then possible to think of $\pi^e_t$ as part of the unobserved state vector of a
standard signal-noise extraction problem similar to that in Fama and
Gibbons (1981) and Burmeister and Wall (1982). Inference about $\pi^e_t$ is
drawn from the observed movements in ($i_t$, $\pi_t$). The identifying restrictions
that allow such inference are (1) the assumption that $\pi_t$
differs from the unobserved $\pi^e_t$ by a white-noise process and (2) the
assumption that the dynamics of $r_t$ are simpler (in the statistical sense
described above) than those of ($i_t - \pi_t$).

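A state-space model describes the evolution of an observed vector $y_t$
in terms of observed inputs $z_t$ and an unobserved state vector $x_t$:

$$x_{t+1} = Fx_t + Gz_t + w_{t+1},$$

$$y_t = Hx_t + Dz_{t-1} + v_t,$$

where, for $j = 1, 2, \ldots$,

$$E(w_tx_{t-j}') = 0, \quad E(w_tz_{t-j}') = 0, \quad E(w_ty_{t-j}') = 0, \quad E(w_tw_t') = Q,$$

$$E(v_tx_{t+1-j}') = 0, \quad E(v_tz_{t-j}') = 0, \quad E(v_ty_{t-j}') = 0, \quad E(v_tv_t') = R.$$

By substitution of the definition $r_t = i_t - \pi^e_t$ into equations (1) and (2),
our model can be written in this general form, with

$$x_t = (\pi^e_t, \pi^e_{t-1}, \ldots, \pi^e_{t-p})',$$

$$z_t = (i_t, i_{t-1}, \ldots, i_{t-p+1}, \pi_t, \pi_{t-1}, \ldots, \pi_{t-p+1}, 1)',$$

$$y_t = (i_t, \pi_t)',$$

$$w_{t+1} = (\epsilon_{2,t+1}, 0, \ldots, 0)',$$

$$v_t = (\epsilon_{1t}, e_t)',$$

$$F = \begin{bmatrix}
(\beta_1 - \alpha_1) & (\beta_2 - \alpha_2) & \cdots & (\beta_p - \alpha_p) & 0 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix},$$

$$G = \begin{bmatrix}
\alpha_1 & \alpha_2 & \cdots & \alpha_p & \gamma_1 & \gamma_2 & \cdots & \gamma_p & k_2 \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & & & & & & & & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix},$$

$$H = \begin{bmatrix}
(1 + \psi_0) & (\psi_1 - \phi_1) & (\psi_2 - \phi_2) & \cdots & (\psi_p - \phi_p) \\
1 & 0 & 0 & \cdots & 0
\end{bmatrix},$$

$$D = \begin{bmatrix}
\phi_1 & \phi_2 & \cdots & \phi_p & \xi_1 & \xi_2 & \cdots & \xi_p & k_1 \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix},$$

$$Q = \begin{bmatrix}
\sigma_2^2 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}, \qquad
R = \begin{bmatrix}
\sigma_1^2 & 0 \\
0 & \sigma_e^2
\end{bmatrix}.$$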

If one knew the value of the parameter vector $q = (\alpha, \beta, \gamma, \phi, k,
\psi_0, \sigma_1, \sigma_2, \sigma_e)'$, the matrices $F$, $G$, $H$, $D$, $Q$, $R$ would be known and the
Kalman filter (see, e.g., Anderson and Moore 1979) could be used to
arrive at optimal estimates of $x_t$ conditional on observations $y_t$,
$y_{t-1}, \ldots$. The filter starts from a Bayesian prior distribution for
the initial state vector,⁵ $x_0 \sim N(\hat{x}_0, P_0)$, which is revised by the data
recursively according to

$$K_{t+1} = (FP_tF' + Q)H'[H(FP_tF' + Q)H' + R]^{-1},$$

$$\hat{x}_{t+1} = F\hat{x}_t + Gz_t + K_{t+1}[y_{t+1} - H(F\hat{x}_t + Gz_t) - Dz_t], \tag{7}$$

$$P_{t+1} = (I - K_{t+1}H)(FP_tF' + Q).$$
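
One step of the recursion (7) can be written out in a few lines. The Python sketch below is a minimal illustration using plain nested lists for matrices (vectors are stored as single-column matrices); the helper names and the scalar numbers in the usage lines are assumptions made for the example, not the paper's estimation code or its parameter values.

```python
def mat_mul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mat_add(a, b, sign=1.0):
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_inv(a):
    """Inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [[float(v) for v in row] + identity(n)[i] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for row in range(n):
            if row != col:
                factor = m[row][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return [row[n:] for row in m]


def kalman_step(x_hat, p, z, y_next, f, g, h, d, q, r):
    """Apply recursion (7) once; returns (x_hat_next, p_next, gain)."""
    prior_mean = mat_add(mat_mul(f, x_hat), mat_mul(g, z))        # F x + G z
    prior_cov = mat_add(mat_mul(mat_mul(f, p), transpose(f)), q)  # F P F' + Q
    innov_cov = mat_add(mat_mul(mat_mul(h, prior_cov), transpose(h)), r)
    gain = mat_mul(mat_mul(prior_cov, transpose(h)), mat_inv(innov_cov))
    # Innovation: y_{t+1} - H(F x + G z) - D z
    innovation = mat_add(mat_add(y_next, mat_mul(h, prior_mean), -1.0),
                         mat_mul(d, z), -1.0)
    x_next = mat_add(prior_mean, mat_mul(gain, innovation))
    p_next = mat_mul(mat_add(identity(len(p)), mat_mul(gain, h), -1.0), prior_cov)
    return x_next, p_next, gain


if __name__ == "__main__":
    # Scalar toy model (all numbers hypothetical): random-walk state, unit noise.
    one, zero = [[1.0]], [[0.0]]
    print(kalman_step(x_hat=zero, p=one, z=zero, y_next=[[2.0]],
                      f=one, g=zero, h=one, d=zero, q=zero, r=one))
```

Iterating `kalman_step` over $t$ with the matrices $F$, $G$, $H$, $D$, $Q$, $R$ built from a given parameter vector would produce the sequence of state estimates and their covariance matrices; the nested-list representation is chosen only to keep the sketch dependency-free.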

Notice that even if we knew the parameter vector $q$ (and thus the
values for $F$, $G$, $H$, $D$, $Q$, $R$) with certainty, the econometrician would
still be unable to separate perfectly inflation forecast errors ($e_t$) from

5 Since my empirical work employs $p = 4$ lags with data starting in 1950:I, the filter
itself starts after $t_0$ = 1951:I. I specified $\hat{x}_{t_0} = (\pi_{t_0}, \pi_{t_0-1}, \ldots, \pi_{t_0-4})'$
with the very diffuse prior $P_{t_0} = 20 \cdot I$.

the underlying structural innovations ($\epsilon_{1t}$, $\epsilon_{2t}$). We can, however, make
an educated guess as to how much of a given quarter’s change in
interest rates and inflation to attribute to agents’ inflation forecasts.
The first element of the estimated state vector, $\hat{x}_t(1|q)$, represents the
best such guess, and the (1, 1) element of the weighting matrix,
$P_t(1, 1|q)$, is the variance of this guess around the true value $x_t(1)$, that
is, around the true expectations held by agents.

In practice, we are faced not just with this filter uncertainty, which
would plague us even if we knew the true value of $q$, but also with
parameter uncertainty as to the value of $q$ itself. We are interested in
characterizing the total uncertainty from these two sources. Recall
that for any random variables $Y$ and $X$ (see, e.g., Lindgren 1976)

$$\operatorname{var}(Y) = E_X \operatorname{var}(Y|X) + \operatorname{var}_X E(Y|X).$$

Suppose that the maximum likelihood procedure described in the
Appendix yields an estimated value $\hat{q}$ for the parameter vector $q$ with
associated asymptotic variance $A$. Adopt the (asymptotic) Bayesian
perspective that the parameter $q$ is itself a random variable, distributed
$N(\hat{q}, A)$. Then our estimate $\hat{\pi}^e_t$ of the inflation rate anticipated by
agents $\pi^e_t$ has variance

$$\operatorname{var}(\hat{\pi}^e_t) = E_q \operatorname{var}(\hat{\pi}^e_t|q) + \operatorname{var}_q E(\hat{\pi}^e_t|q). \tag{8}$$

The first term is associated with the filter uncertainty discussed above,
for var($\hat{\pi}^e_t|q$) $= P_t(1, 1|q)$. Consider therefore Monte Carlo generation
of 200 different vectors $q$ all drawn from a $N(\hat{q}, A)$ distribution. For
each Monte Carlo draw repeat the Kalman filter iteration (7) on $t = 0,
1, \ldots, T$. For each $t$ the average value of $P_t(1, 1|q)$ across these draws
corresponds to the filter uncertainty $E_q$ var($\hat{\pi}^e_t|q$) characterizing the
econometrician’s uncertainty as to agents’ true beliefs about
what inflation was going to be for that date.

The second term in (8) can be interpreted as parameter uncertainty.
Different values for $q$ would cause the Kalman filter to produce
different values for the best guess as to what agents were actually
thinking at each date: $E(\hat{\pi}^e_t|q) = \hat{x}_t(1|q)$. The Monte Carlo generation
of different values of $q$ drawn from a $N(\hat{q}, A)$ distribution can indicate
how much this series would vary for equally plausible parameter
vectors. Specifically, for each date $t$ the sample variance of $\hat{\pi}^e_t$ across
Monte Carlo draws can provide an estimate of var$_q E(\hat{\pi}^e_t|q)$.
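
The bookkeeping behind decomposition (8) can be sketched as follows for a single date. This Python fragment assumes that, for each parameter draw, some filtering routine has already produced a point estimate and a filter variance; the callables `filtered_mean` and `filtered_var` and the scalar toy example are hypothetical stand-ins for that routine, not the paper's procedure.

```python
import random
import statistics


def total_uncertainty(draws, filtered_mean, filtered_var):
    """Split the variance at one date into filter and parameter components.

    draws         -- parameter values sampled from the (asymptotic) distribution
    filtered_mean -- callable q -> point estimate of expected inflation given q
    filtered_var  -- callable q -> filter variance (the P_t(1, 1 | q) term) given q
    """
    means = [filtered_mean(q) for q in draws]
    variances = [filtered_var(q) for q in draws]
    filter_part = statistics.fmean(variances)    # average of var(. | q) over draws
    parameter_part = statistics.variance(means)  # sample variance of E(. | q)
    return filter_part, parameter_part, filter_part + parameter_part


if __name__ == "__main__":
    # Toy scalar example: 200 draws of a hypothetical parameter q ~ N(1.0, 0.2^2).
    rng = random.Random(0)
    draws = [rng.gauss(1.0, 0.2) for _ in range(200)]
    parts = total_uncertainty(draws,
                              filtered_mean=lambda q: 2.0 * q,
                              filtered_var=lambda q: 0.5)
    print(parts)
```

The square root of the third returned value plays the role of the total standard error at that date.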

III. Results 

Column 1 of table 1 reports 3-month U.S. Treasury bill annualized
rates. The first entry, $i_1$, corresponds to the February 1950 3-month
rate, $i_2$ to the May 1950 rate, and so on, measured in units of 100 basis
points. The adjacent inflation sequence begins with $\pi_1$, the percentage
change in the seasonally adjusted implicit GNP deflator in the second
quarter of 1950 over the first quarter, again expressed in annualized
rates in units of 100 basis points.

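Using these data, I obtained the following maximum likelihood
estimates (standard errors in parentheses):

$$\begin{aligned}
r_t = {} & \underset{(.81)}{-1.59} + \underset{(.23)}{.35}\,r_{t-1} + \underset{(.23)}{.64}\,r_{t-2} + \underset{(.25)}{.46}\,r_{t-3} - \underset{(.25)}{.17}\,r_{t-4} + \underset{(.37)}{.79}\,\pi^e_t \\
& + \underset{(.25)}{1.46}\,\pi^e_{t-1} + \underset{(.36)}{.54}\,\pi^e_{t-2} - \underset{(.29)}{.71}\,\pi^e_{t-3} - \underset{(.26)}{.11}\,\pi^e_{t-4} - \underset{(.19)}{.83}\,\pi_{t-1} \\
& - \underset{(.22)}{.81}\,\pi_{t-2} - \underset{(.24)}{.92}\,\pi_{t-3} - \underset{(.15)}{.29}\,\pi_{t-4} + \epsilon_{1t}; \qquad \hat{\sigma}_1 = \underset{(1.72)}{.0040};
\end{aligned} \tag{9}$$

$$\begin{aligned}
\pi^e_t = {} & \underset{(.44)}{1.00} + \underset{(.13)}{.40}\,r_{t-1} + \underset{(.15)}{[\text{illegible}]}\,r_{t-2} - \underset{(.16)}{.16}\,r_{t-3} + \underset{(.15)}{.14}\,r_{t-4} - \underset{(.14)}{.25}\,\pi^e_{t-1} \\
& - \underset{(.19)}{.50}\,\pi^e_{t-2} - \underset{(.18)}{.39}\,\pi^e_{t-3} + \underset{(.17)}{.043}\,\pi^e_{t-4} + \underset{(.062)}{.48}\,\pi_{t-1} + \underset{(.081)}{.47}\,\pi_{t-2} \\
& + \underset{(.077)}{.60}\,\pi_{t-3} + \underset{(.073)}{.22}\,\pi_{t-4} + \epsilon_{2t}; \qquad \hat{\sigma}_2 = \underset{(.087)}{.44};
\end{aligned} \tag{10}$$

$$\pi_t = \pi^e_t + e_t; \qquad \hat{\sigma}_e = \underset{(.070)}{1.56}. \tag{11}$$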

These parameter values were then used in the Kalman filter (eqq. [7])
to generate econometric estimates of the rate of inflation actually
anticipated by agents. This series is reported in column 3 of table 1.
Note in the light of note 5 that the first five entries of columns 2 and 3
are identical by construction; econometric inference about $\pi^e_t$ actually
begins in 1951:II.

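The Kalman filter produced this series for $\hat{\pi}^e_t$ by taking a time-varying
weighted average of present and past values of $i_t$ and $\pi_t$ and
past values of $\hat{\pi}^e_t$. For our parameter values, the steady-state values for
these weights turned out to be (digits that are illegible in the source are shown as “?”)

$$\begin{aligned}
\hat{\pi}^e_{t+1} = {} & .88? - .622\,\hat{\pi}^e_t - .658\,\hat{\pi}^e_{t-1} - .140\,\hat{\pi}^e_{t-2} - .03?\,\hat{\pi}^e_{t-3} \\
& + .557\,i_{t+1} - .19?\,i_t + .354\,i_{t-1} - .256\,i_{t-2} + .095\,i_{t-3} \\
& + .0000036\,\pi_{t+1} + .460\,\pi_t + .454\,\pi_{t-1} + .514\,\pi_{t-2} + .160\,\pi_{t-3}.
\end{aligned} \tag{12}$$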



TABLE 1

Nominal Interest Rates, Actual Inflation Rates, Econometric Estimates of Rates of Inflation Anticipated by Agents,
and Variances of Econometric Estimates

The diligent reader can confirm that after 1960 the entries of column
3 of table 1 can be generated by equation (12). This equation could
easily be used to extend this series to postsample data.

In interpreting these results, recall that $\pi_t$ represents the actual
inflation rate, $\pi^e_t$ the rate of inflation anticipated by agents, and $\hat{\pi}^e_t$ our
econometric estimate of what the agents were anticipating. Neither
equation (10) nor (12) should be construed as the algorithm used by
agents in forming these anticipations. Instead, (10) is simply the result
one would get if these anticipations were to be regressed on lagged
values of the real rate, actual inflation, and expected inflation. Likewise,
equation (12) merely summarizes the updating rule used by the
econometrician to uncover what rates of inflation the bond market was
anticipating. Note that this rule allows the econometrician to utilize
realized ex post inflation rates $\pi_t$ to draw inferences about the ex ante
forecasts of agents, $\pi^e_t$; agents, by contrast, do not observe $\pi_t$ until date
$t + 1$. However, our best parameter estimates suggest that in fact
virtually no additional information about what agents were historically
anticipating could be gleaned from ex post inflation data; essentially
all of what agents anticipate about inflation seems to be reflected
in current nominal interest rates and the past history of inflation and
interest rates.

We must further be careful to distinguish between two questions
about standard errors.

1. How good a job did agents do at forecasting inflation; that is, how
close is $\pi^e_t$ to $\pi_t$? The answer to this question is given by
$\sigma_e^2 = E(\pi^e_t - \pi_t)^2$. The maximum likelihood estimate for
this parameter was reported in equation (11) to be 2.43. Thus, 95 percent
of the time agents should have been no more than 312 basis points off in
forecasting annualized inflation rates 1 quarter ahead.

2. How good a job did the econometrician do at uncovering what agents
were thinking; that is, how close is $\hat{\pi}^e_t$ to $\pi^e_t$? I argued
in equation (8) above that this magnitude,
$\mathrm{var}(\hat{\pi}^e_t) = E(\hat{\pi}^e_t - \pi^e_t)^2$, can be
decomposed into two separate sources of uncertainty. The first is the
filter uncertainty that arises from the impossibility inherent in any
signal-noise extraction problem of distinguishing the fundamental
innovations from the inflation forecast errors $e_t$. A measure of this
uncertainty is provided by the first term in equation (8), the estimated
variance of which is reported in column 4 of table 1.
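
As a quick arithmetic check on the 312 basis point figure in question 1, the sketch below starts from the reported variance of 2.43 and assumes forecast errors are roughly normal, so that about 95 percent of them fall within two standard deviations. The function name is illustrative and does not come from the source.

```python
import math

def two_sigma_bound_bp(variance_pct_sq):
    """Two-standard-deviation bound, in basis points, for a forecast
    error whose variance is given in squared percentage points."""
    sigma_pct = math.sqrt(variance_pct_sq)   # standard error in percentage points
    return 2.0 * sigma_pct * 100.0           # 1 percentage point = 100 basis points

sigma_e_sq = 2.43                             # estimate reported in equation (11)
print(round(math.sqrt(sigma_e_sq), 2))        # standard error, about 1.56
print(round(two_sigma_bound_bp(sigma_e_sq)))  # about 312 basis points
```

The same standard error of 1.56 reappears later in the text when forecast errors during recessions are compared with those in other quarters.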

The second concern in question 2 above arises from uncertainty
about the true values of the parameters in equations (9)-(11). The
estimate of this variance (the second term in eq. [8]) is listed in column
5 of table 1. Despite high standard errors for the parameter estimates
in equations (9) and (10), the estimates for $\pi^e_t$ itself seem
relatively robust to this uncertainty about parameters. High
cross-correlations between parameter estimates account for this relative
precision. For example, the (1, 4) element of the $F$ matrix is
$(\beta_1 - \alpha_1)$. Individually, the estimates of these parameters
have standard errors of .17 and .15, larger than the estimated values of
either of the parameters alone. However, given the correlation between
$\hat{\beta}_1$ and $\hat{\alpha}_1$ of .88, the imputed value $F(1,4)$
has a standard error of .08, half as large as that of either parameter
from which it is constructed. Thus, even though individual parameters are
estimated quite poorly, acceptable estimates of $\pi^e_t$ may still be
obtained.
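
The precision gain described above follows from the usual formula for the variance of a difference of two correlated estimates. A minimal sketch, using only the three numbers quoted in the text (the function name is an illustrative choice):

```python
import math

def se_of_difference(se_a, se_b, corr):
    """Standard error of (a - b) given the standard errors of a and b
    and the correlation between the two estimates."""
    var = se_a ** 2 + se_b ** 2 - 2.0 * corr * se_a * se_b
    return math.sqrt(var)

# Standard errors of .17 and .15 with a correlation of .88, as in the text.
print(round(se_of_difference(0.17, 0.15, 0.88), 2))   # about .08
# With uncorrelated estimates the same difference would be far less precise.
print(round(se_of_difference(0.17, 0.15, 0.0), 2))
```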

The total econometric uncertainty is the sum of columns 4 and 5,
whose square root gives a standard error for $\hat{\pi}^e_t$ around
$\pi^e_t$ (col. 6). Typically our series would differ from the actual
expectations of agents with a standard error of 50 basis points.


IV. Economic Implications 

A. Efficient Markets 

The imputed forecast error $\hat{e}_t \equiv \pi_t - \hat{\pi}^e_t$ was
checked for the following properties.


Unbiasedness 

During the period 1951:II-1982:I, $\hat{e}_t$ has mean 0.03, taking
positive values in 61 quarters and negative in the remaining 63. Certainly
$\hat{\pi}^e_t$ is an unbiased predictor of $\pi_t$ and exhibits no
tendency to systematically overpredict or underpredict inflation.


Rationality 

We find $\mathrm{corr}(\hat{e}_t, \hat{e}_{t-1}) = -.01$. Notice further
that the number of negative errors observed prior to a positive error
should have a geometric distribution with mean 1 and variance 2 (and take
on the value zero with probability 1/2). Of the 60 runs between positive
forecast errors, 31 turn out to have length zero, as expected, and also
the mean run length is 0.90 (with a standard error of .183). Finally, the
first row of table 2 tests the restriction $c_1 = c_2 = c_3 = c_4 = 0$ in
the ordinary least squares regression

$$\hat{e}_t = c_0 + c_1 x_{t-1} + c_2 x_{t-2} + c_3 x_{t-3} + c_4 x_{t-4} + v_t \tag{13}$$

for $x_t = \hat{e}_t$ and $t$ = 1952:I-1982:I. Clearly there is every
indication that forecast errors are independent of lagged forecast errors.
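
The run-length figures quoted above can be checked directly. The sketch below assumes, as the text does, that each error is positive with probability 1/2 independently of the others, so that the number of negative errors before the next positive one is geometric. The function name is illustrative.

```python
import math

def geometric_moments(p):
    """Mean and variance of the number of failures before the first
    success when each trial succeeds with probability p."""
    mean = (1.0 - p) / p
    variance = (1.0 - p) / p ** 2
    return mean, variance

mean, variance = geometric_moments(0.5)    # mean 1, variance 2
n_runs = 60
se_mean = math.sqrt(variance / n_runs)     # standard error of the average run length
z = (0.90 - mean) / se_mean                # observed mean run length of 0.90
expected_zero_runs = n_runs * 0.5          # runs of length zero expected under the null
print(mean, variance, round(se_mean, 3), round(z, 2), expected_zero_runs)
```

The observed mean of 0.90 lies well within one standard error of its expected value, and the 31 zero-length runs are close to the 30 expected.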

TABLE 2

Tests of Whether Estimated Inflation Forecast Errors Were Predictable
on the Basis of Four Lagged Values of Other Explanatory Variables

Explanatory Variable          Sample Period    F-Statistic       5 Percent Critical Value

Inflation forecast errors     1952:I-1982:I    F(4,116) = .35    2.29
Realized inflation rates      "                F(4,116) = .27    "
3-month T-bills               "                F(4,116) = .37    "
5-year T-bonds                "                F(4,116) = .69    "
Real GNP                      "                F(4,116) = .92    "
M1 (seasonally unadjusted)    1960:I-1982:I    F(4,84) = .86     2.33
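
Each F-statistic in table 2 comes from a regression of the form (13). The sketch below shows one way such an exclusion test could be computed. It is not the author's code: the data are synthetic, the function names are illustrative, and the normal equations are solved by plain Gauss-Jordan elimination rather than a statistics library.

```python
import random

def ols_ssr(X, y):
    """Sum of squared residuals from regressing y on the columns of X,
    solving the normal equations by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def lag_exclusion_f(e, x, lags=4):
    """F-statistic for c_1 = ... = c_lags = 0 in a regression of e_t on a
    constant and `lags` lagged values of x, with its degrees of freedom."""
    rows = range(lags, len(e))
    y = [e[t] for t in rows]
    X_u = [[1.0] + [x[t - j] for j in range(1, lags + 1)] for t in rows]
    X_r = [[1.0] for _ in rows]                 # restricted model: constant only
    ssr_u, ssr_r = ols_ssr(X_u, y), ols_ssr(X_r, y)
    df2 = len(y) - (lags + 1)
    f_stat = ((ssr_r - ssr_u) / lags) / (ssr_u / df2)
    return f_stat, (lags, df2)

random.seed(0)
# Synthetic, serially independent "forecast errors": 125 values leave
# 121 usable observations after four lags, hence F(4, 116).
e = [random.gauss(0.0, 1.56) for _ in range(125)]
f_stat, df = lag_exclusion_f(e, e)
print(df, round(f_stat, 2))
```

Passing a different series as `x` (realized inflation, an interest rate, and so on) gives the analogue of the other rows of the table.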


We assume that agents also use lagged interest rates and lagged
realized inflation rates in forming their expectations. Rows 2 and 3 in
table 2 present regression results for $x_t = \pi_t$ and $x_t = i_t$ in
equation (13). We again find no indication that either of these series is
of any use in forecasting $\hat{e}_t$.

Perhaps a more interesting test is provided by seeing whether other
variables that should have been known to agents but were not used in
our econometric estimation of $\hat{\pi}^e_t$ are useful in predicting
$\pi_t - \hat{\pi}^e_t$. We find in table 2 that our series in fact could
not have been predicted on the basis of lagged long-term interest rates,
real GNP, or the money supply.⁶


Efficiency 

Our technique assumes that agents use at least lagged $\pi_t$ and lagged
$i_t$ in forming $\pi^e_t$. If agents use this and other information
efficiently, the estimated forecast variance attributed to agents should
be less than that for a simple regression of $\pi_{t+1}$ on $i_t$,
$i_{t-1}$, $i_{t-2}$, $i_{t-3}$, $\pi_t$, $\pi_{t-1}$, $\pi_{t-2}$,
$\pi_{t-3}$, and a constant. The standard error for the latter regression,
estimated 1950:I-1982:I, is 2.15 versus a value of 1.56 imputed by
our method to the standard forecast error actually made by agents.
The difference is a measure of the value of information known to the
financial markets but not to the bivariate autoregression in forecasting
inflation.


6 Notice that this test could have led to rejection even if the
estimation procedure were correct and even if market expectations were
rational. For, even if $(\pi_t - \pi^e_t)$ were uncorrelated with
$x_{t-1}$, there is no guarantee that $(\pi_t - \hat{\pi}^e_t)$ would also
be uncorrelated with $x_{t-1}$ if $x_{t-1}$ were not used in the
construction of $\hat{\pi}^e_t$.

B. Monetary Policy and Business Cycles 

Most macroeconomists agree that monetary contractions have been a
contributing factor in some postwar recessions. One school of thought
(e.g., Lucas 1973) argues that unanticipated disinflation plays an
important role in this process. If true, we should see periods of
recession associated with negative values of $\pi_t - \pi^e_t$. Of the 29
quarters between 1951:I and 1982:I characterized by recession,⁷ inflation
came out lower than expected in 14 quarters. The average error during
these 29 quarters was -.062. Under the null hypothesis that these quarters
are no different from any others in the kinds of forecast errors made
by agents, this average should have mean zero and standard deviation
$1.56/\sqrt{29} = .29$. There is thus every indication that financial
markets make no more errors forecasting inflation during recessions than
they do at any other time.
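
The comparison in the preceding paragraph amounts to a z-test on the mean of 29 forecast errors. A minimal sketch using the figures quoted in the text, with an illustrative function name and the assumption that the errors are independent across quarters:

```python
import math

def mean_error_z(avg_error, sigma, n):
    """z-statistic for the average of n independent forecast errors,
    each with mean zero and standard deviation sigma under the null."""
    se = sigma / math.sqrt(n)
    return avg_error / se, se

z, se = mean_error_z(-0.062, 1.56, 29)
print(round(se, 2), round(z, 2))   # standard deviation about .29, z about -0.21
```

A z-statistic this small gives no reason to think recession quarters differ from others in the sign of the forecast error.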

An alternative tradition argues that monetary contractions contribute
to recessions by increasing real interest rates. My values for this
series are plotted in figure 1. It indeed turns out that for these
estimates recessions are characterized by average real interest rates of
1.13, more than twice the average value of 0.53 seen outside of
recessions.

The individual pattern, however, seems to be quite different for
different recessions. Blinder (1979), among others, has argued that
monetary tightness contributed to the 1973-75 recession. But
whether one uses the classical or Keynesian measure of tightness, this
claim is unsupported by our series: inflation seems to have come out
substantially higher than expected even as real interest rates were
significantly lower than usual. The primary causes of this recession
could not have been monetary. By contrast, both measures agree that
monetary policy was unusually tight during the most recent recession.
Of the unprecedented 900 basis point spread between nominal interest
rates and ex post inflation rates in 1981:IV, my estimates suggest
that 340 basis points might be attributed to the forecast errors of
agents and the remaining 560 basis points to unusually high ex ante
real interest rates (to be contrasted with the average ex ante rate of 67
basis points over the complete postwar period).


7 This analysis uses the NBER dates 1953:III-1954:II, 1957:III-1958:II,
1960:II-1961:I, 1969:IV-1970:IV, 1973:IV-1975:I, 1980:I-1980:III, and
1981:III-1982:I.

V. Conclusions 

The central assumption of this investigation has been that the dynamics
governing ex ante real interest rates are simpler than those of ex
post real interest rates. Standard signal-noise extraction theory could
then be applied to arrive at econometric estimates of agents'
expectations of inflation based on the time-series properties of nominal
interest rates and realized inflation. My estimated series for expected
inflation is consistent with the assertion that financial market
expectations of inflation have historically been unbiased and rational and
make efficient use of information available to agents. The market's
standard error in forecasting annualized inflation rates 1 quarter ahead
appears to be on the order of 150 basis points.

This series further suggests that periods of recession are not
associated with negative inflation forecast errors, as would be predicted
by equilibrium business cycle theory if these recessions were thought to
be caused by demand shocks. If monetary contractions play a role in
recessions, this might instead be attributed to the fact that recessions
have been characterized by ex ante real interest rates that are twice as
high on average as in normal periods.

Additional details of the following discussion are provided in Hamilton
(1984).


Identification


From theorem E of Hannan (1971) as interpreted by Preston and Wall
(1975), the autoregressive coefficients of the observed process (5) and
(6) of $(i_t - \pi_t, \pi_t)'$ can be estimated separately from the moving
average terms provided that there are no common roots between the
autoregressive (A1) and moving average (A2) determinant polynomials:


[1 - 4>(41(1 - PU) - 7<z)] + a(z)[-t|»„ - ij»(r) - ?U)1 = I). (Al) 
o^tr? + (T'jtr^[ 1 + a(z) - (i(z)l[l + a(z“‘) - P(z" *)J 
+ crjo-;! 1 + + tfi(z) - <t»(z)](I + tf»o + i)j(z "') - 4.(3-')] = (). 

As an illusti ative example, suppose a = 0 and ii>(r*) = 1 so z* is a root of 
(A 1). Then z* would also be a root of (A2) only if trfal + o^cr'^l - Plz*)]! 1 - 
fi(z* "')] just happened to equal -aHtr)(t|;„ + tlt(z*))[tl»d + d/(z* ')], which 
would require the elements of «J> to hold a particular, exact numerical t elation 
to those ol P, * and the si/e /if the variances. F.xcept in unusual knife-edge 
cases of this sort, the parametei s [<t> + a, ip + 4 + iMP + 7 ), p + 7 ] can 
he inferred from the autoregiessive coefficients relating i , and it, 

The conditions under which the rest of the structural parameters can he 
uncovered I rout the covariance structure of the moving average terms can be 
calculated as follows. Let #,,(*) = = (4>* - >Ji*) + (1 + ihi'(°M ~ Pt) 

for k = 1 . jj (zero otherwise); 6 * - (a* - p*) for k = 1 , . , p and A* = 1 

for A = 0 (zero otherwise). Construct f- R ' 1, * 1 —* R'' 1 ’ ~ 1 mapping (of, a, b) into 

L?n(;). gt /(/). ffM/by =• 1. 2 . P - 1; g\At>). gaaf/t)]. Taking p = 4 for 

illustration, it can he shown (by elementary row and column operations) that 
the jacobian of f( ) has full rank if and only if the following matrix has full 
rank (3): 


-«,«•> - - fi| + ajO- bo) 

(b { - />.,) b x (bi - b>) 

- rijUi + n x ft^(bs - b.) 

/>.. 


(cij + flu ¥ n ,)ri 
0 

- h a ,;<} 

1 ¥ />j + b-, ¥ 6 , + b, 


, - n^r:, tisti 1 + 

I + - by - bj> 

-« J , - + ti \(1 ft 

l + 6 , 


where a s <i\ + -f a\ + and b = —■ 1 + b\ — b t Again, ihis condition will 
be satisfied unless bv some coincidence the different elements of gi. t|>, P, a 
happen to obey particular exact numeric al relations to one another. One such 
set of knife-edge restrictions perhaps deserves mention. Suppose expected 
inflation had no effect whatever on nominal interest rates. Reierting bark to 
equation (1), this would imply 4>o = - 1 and vp, = 4>, for; =1,2, . ,p But 

from the definition of a this in turn would mean a, = 0 foi j = 1,2. f>. in 

which case the matrix above fails to have full rank. In older lor our proce¬ 
dure to recover from observations on i, and it,, expectations must exert an 
influence on nominal interest rates over and above that exerted bv past ie.il- 
izations of inflation. Moreover, equations (1) and (2) must accuiatelv describe 
the dynamics of ex ante magnitudes for some low value of />• 

Given^i, b, cr?, observe that the parameters ih>. tr), tfi can be uniquely 
recovered from g, ,(0). g,,(<». and g a2 (0); At, *2 are of course retovered from 
the means of the ARMA process for (/, - it,, t r,) From these functions ol the 
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structural parameters, the complete parameter vector q = (a, fj, 7 , <{»> «|t, k, 
<l»o. ffj, (J 2 , a,)' can be readily recovered. 

Estimation 

Note that for a Gaussian distribution y = (i 1+/ „ n\ + p, h+p, 112 + p, . 
tir+p)’ h as conditional log likelihood 

L(y\ t|, iti, . . . , ip. it p ) = c - Vx{2Tl og(a?) + log[det(2)]} 

ul~‘u 
2 ^ ' 


■ 1 1 +/j. 


(A3) 


where 

u ~ "2,1 +/;. “*,2+/<. “2,2+/» ■■■< u *.I +/>. “27 + />)’> 

(2 / x 11 

“*,1 — <i ~ [^1 + (1 + •('0)^2] ~ ( 4 >i + (1 + •!<())« 1 ]h -1 

- [<t >2 + (\ + 4'(l)«2]'r-2 - • . • - (4>^ + (1 + <J>0 )Olp]i,-p 

~ (>ki + It " 4>i + (1 + 'I'oXPi + 7i _ i)Jir,_ 1 - . . . 

- [«!»/> + Cp - 4>/» + (1 + *l»o)(P/> + tp - «p)]w ( 

u%, = 17 , - k 2 - a,!,., ~ a 2 !,-2 - ... - Upi,-p 

~ (Pi + 7 i - “ 1 K -1 “ • • • “ (P;, + 7f> ~ a p)^t p- 

r - a '\ + ( [ + '|t(l )^2 (• + <l»o )^2 

L*() - 

. (1 + t|»oM ffS 

<», + *«* « ; + A 

pi * k a h + pl)p 

h,; 0 

H r , H r 

H,,/ H„ 

0 0 H„ + (Wo?) J 

The special structure of $\Sigma$ allows rapid calculation of the Cholesky
factorization $\Sigma = LL'$.⁸ The lower triangular band structure of $L$
in turn allows the system of equations $Lx = u$ to be quickly solved for
$x$ by straightforward

0 

0 

0 


H/= X 


X = 

CIT*’IT) 


Ho + (Go/ff?) H, 

H, H„ + [CM) 

H, H, 


8 In my empirical work this was implemented with the FORTRAN program
LUDAPB in the International Math and Science Library. Other algorithms for
this problem in which the covariance-generating function is known but a
specific moving average factorization is not are provided by Hansen and
Sargent (1981), Burmeister and Wall (1982), and Watson and Engle (1983).
elimination, from which $u'\Sigma^{-1}u = x'x$. $\det(\Sigma)$ is also a
natural by-product of the Cholesky algorithm. Note that the procedure
above exploits the symmetric band structure and known zeroes of $\Sigma$
to evaluate $u'\Sigma^{-1}u$ in vastly less time than direct inversion of
$\Sigma$ would require.
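
The idea of evaluating the quadratic form and the determinant from a Cholesky factor, without ever inverting the covariance matrix, can be illustrated as follows. This is a dense, pure-Python sketch on a small made-up matrix; it does not exploit the band structure that the text relies on for speed, and the function names are illustrative.

```python
import math

def cholesky(S):
    """Lower triangular L with S = L L' for a symmetric positive definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    return L

def forward_solve(L, u):
    """Solve L x = u by forward elimination."""
    x = []
    for i, row in enumerate(L):
        x.append((u[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x

def quad_form_and_logdet(S, u):
    """Return u' S^{-1} u and log det(S) without inverting S."""
    L = cholesky(S)
    x = forward_solve(L, u)
    quad = sum(v * v for v in x)                        # x'x = u' S^{-1} u
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(len(S)))
    return quad, logdet

# Small tridiagonal (banded) example with made-up numbers.
S = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
u = [1.0, 2.0, 3.0]
quad, logdet = quad_form_and_logdet(S, u)
print(quad, math.exp(logdet))
```

For a banded matrix the inner sums in `cholesky` and `forward_solve` only need to run over the bandwidth, which is where the time saving described in the text would come from.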

The likelihood (A3) was accordingly maximized iteratively as follows.
First, an initial guess was made for the parameter vector $q$, and (A3)
was evaluated as described above. The gradient was estimated numerically
by repeating this evaluation for small perturbations in each element of
$q$. A Davidon-Fletcher-Powell variable-weighting scheme was then used
successively to improve the estimate of $q$. An estimate of the asymptotic
variance of $q$ is afforded by the convergent value of the weighting
matrix in the Davidon-Fletcher-Powell algorithm.
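
A rough sketch of the kind of iteration described above is given below. It is not the author's program: the objective is a toy quadratic standing in for the negative of the log likelihood, the backtracking line search and all function names are assumptions made for illustration, and only the numerical gradient and the Davidon-Fletcher-Powell update of the weighting matrix follow the description in the text.

```python
def numerical_gradient(f, q, h=1e-6):
    """Central-difference estimate of the gradient of f at q."""
    grad = []
    for i in range(len(q)):
        up = q[:i] + [q[i] + h] + q[i + 1:]
        dn = q[:i] + [q[i] - h] + q[i + 1:]
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

def dfp_minimize(f, q0, iters=200, tol=1e-6):
    """Minimize f with Davidon-Fletcher-Powell updates of a weighting
    matrix H (an inverse-Hessian approximation) and backtracking steps."""
    n = len(q0)
    q = list(q0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = numerical_gradient(f, q)
    for _ in range(iters):
        if max(abs(v) for v in g) < tol:
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        f0, step = f(q), 1.0

        def f_at(t):
            return f([qi + t * di for qi, di in zip(q, d)])

        while f_at(step) > f0 + 1e-4 * step * slope and step > 1e-12:
            step *= 0.5
        s = [step * di for di in d]
        q_new = [qi + si for qi, si in zip(q, s)]
        g_new = numerical_gradient(f, q_new)
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(a * b for a, b in zip(s, y))
        Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
        yHy = sum(a * b for a, b in zip(y, Hy))
        if sy > 1e-12 and yHy > 1e-12:   # update only when curvature is positive
            for i in range(n):
                for j in range(n):
                    H[i][j] += s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
        q, g = q_new, g_new
    return q, H

def toy_objective(q):
    """Stand-in for minus the log likelihood; minimized at (1, -0.5)."""
    return (q[0] - 1.0) ** 2 + 2.0 * (q[1] + 0.5) ** 2

q_hat, H_hat = dfp_minimize(toy_objective, [0.0, 0.0])
print([round(v, 4) for v in q_hat])
```

When the objective is minus a log likelihood, the converged `H` approximates the inverse of its Hessian, which is the sense in which the text uses the weighting matrix as a variance estimate.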


Open Market Operations in an Overlapping
Generations Model


Douglas G. Waldo

University of Florida

Previous overlapping generations models implied effects for open
market operations that conflict dramatically with the observed positive
correlation between money and the price level and have been
criticized for failing to model the medium of exchange role of
money. The purposes of this paper are to develop an overlapping
generations model that gives money a medium of exchange role and
that is consistent with the evidence and to point out the crucial role
of the assumptions made in earlier models about the interaction of
monetary and fiscal policy and about the supply-side effects of open
market operations.


I. Introduction 

Previous work with overlapping generations models (Bryant and Wallace
1979, 1980; Sargent and Wallace 1981; Wallace 1981) has produced
predictions that open market purchases will either decrease the
price level and/or inflation rate or leave them unchanged.¹ The
inconsistency of such predictions with the observed positive correlation
between money and prices has led many to criticize these models,
focusing on the absence of a medium of exchange role for money.²
The first purpose of this paper is to develop an overlapping generations
model that gives money a medium of exchange role and that is
consistent with the evidence.


I would like to thank an anonymous referee, Bill Bomberger, David
Denslow, Dale Henderson, David Papell, and Mark Rush for helpful comments.

1 The single overlapping generations model where open market operations
have expansionary effects is that of Bryant (1980). His paper differs from
this one in that money does not have a medium of exchange role (e.g., his
model is consistent with equilibria where the nominal interest on bonds is
negative yet both bonds and money are held), and crucial differences
between his model and previous overlapping generations models in regard to
the interaction of fiscal and monetary policy are not pointed out.


[Journal of Political Economy, 1985, vol. 93, no. 6]

© 1985 by The University of Chicago. All rights reserved. 0022-3808/85/9306-0009$01.50

Section II of this paper develops an overlapping generations model 
where money has a medium of exchange role and uses it to analyze 
open market operations. First, theories of the demand for currency 
and government bonds are developed. In the model households hold 
both currency and interest-bearing demand deposits, which are titles 
to interest-bearing government bonds. Ownership of currency, a 
bearer asset, is transferred costlessly whereas ownership of demand 
deposits, a nonbearer asset, is transferred with a lump-sum check¬ 
writing fee that is necessary to compensate the bank for recording the 
transfer. This framework implies that households will hold currency 
in anticipation of small purchases and demand deposits in anticipa¬ 
tion of large purchases. Next the model is used to analyze a once and 
for all open market purchase that has expansionary implications a la 
Metzler (1951), lowering the nominal interest rate and raising the
price level. The nominal interest rate falls to induce households to 
switch from demand deposits to currency. This fall in the nominal 
interest rate induces an excess demand for goods, and the price level 
must rise to decrease the real value of financial wealth. These results 
differ from those of the infinite horizon models of Grossman and 
Weiss (1983), Barro (1984), and Rotemberg (1984) in that the real 
and nominal interest rates decline with a once and for all open market 
purchase and the price level rises less than proportionately, while in 
infinite horizon models the real interest rate is unchanged and the 
price level rises proportionately. 

In addition to the role of money this model differs from earlier 
overlapping generations models on assumptions about the interaction 
of fiscal and monetary policy and about the supply-side effects of 
monetary policy. The second purpose of this paper is to point out the 
crucial role of these assumptions and show that even when money has 
a medium of exchange role either of these earlier assumptions implies 
contractionary effects for open market purchases. Section III devel¬ 
ops the implications of these earlier assumptions. 

Earlier models such as Bryant and Wallace (1979) and Sargent and
Wallace (1981) assumed constant fiscal policy in the sense that both
government spending and the narrowly defined deficit are held
constant.³ Under this assumption lump-sum taxes are held constant and
the inflation tax must vary to cover marginal variations in interest
payments. Since an open market purchase lowers interest payments,
inflation tax revenues must fall. Hence an open market purchase
implies a lower inflation rate and is contractionary. This result is
similar to that of Blinder and Solow (1973), who incorporated a
government budget constraint in a fixed-price Keynesian model and
showed that an open market purchase results in lower output. In
contrast, the model developed in Section II holds fiscal policy
constant in the sense that both government spending and the broadly
defined government deficit (which includes interest payments and
inflation tax revenues) are fixed. Thus the effects of monetary
experiments on interest payments and inflation tax revenues are offset by
variations in lump-sum taxes.

Another pair of earlier models (Bryant and Wallace 1979, 1980)
assumed that the dominant effect of an open market purchase is on
the supply side of the economy. With an open market purchase and a
shift by the public from bonds to currency, fewer real resources are
needed to intermediate the bonds, and the resulting increase in the
supply of goods drives down the price level. In contrast, in the model
developed in Section II it is assumed that negligible resources are
released from the intermediary industry when there is a marginal
shift from bonds to currency.


2 See McCallum (1983b) and the comments of the discussants in Kareken and
Wallace (1980).

3 The definition of the deficit has proven to be important in other
contexts also. McCallum (1984) uses an infinite horizon model to consider
the monetarist proposition that bond-financed deficits have no effect on
inflation. He finds that this proposition depends crucially on including
interest payments in the deficit. The issue of including the inflation tax
in the deficit is not addressed since McCallum studies equilibria with
zero inflation rates.


II. A Model Where Open Market Purchases
Are Expansionary

A. The Individual

The model is an overlapping generations model with a fixed population.
During the first period of life individuals receive an endowment,
$y$, part of which is saved and part of which is consumed. Savings are
split between bank deposits and currency ($s = b + c$). Bank deposits
are claims against government bonds and pay interest at the nominal
rate $R$. Currency pays no interest. The ratio of the price level in the
first period of life to the price level in the second period of life is
given by $\Pi$, and this is known in advance. Hence the actual and
expected real rate of return on a deposit is $(1 + R)\Pi - 1$ and the
actual and expected real rate of return on currency is $\Pi - 1$. In
addition, currency is a bearer asset with no transactions costs involved
when it is spent, while a bank deposit is a nonbearer asset involving a
lump-sum cost every time a portion of it is spent. Nonbearer assets
involve a lump-sum transaction cost since any transfer of these assets
must be recorded by a third party. In this case the transfer must be
recorded by the bank, which charges a fee of $\theta$ for every check
written on the deposit.
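
The two real rates of return defined in this paragraph are easy to tabulate. A minimal sketch with illustrative numbers, where the function name and the sample values of the nominal rate and the price ratio are assumptions rather than figures from the paper:

```python
def real_returns(R, Pi):
    """Real rates of return on deposits and on currency, where R is the
    nominal interest rate and Pi is the ratio of the first-period price
    level to the second-period price level."""
    deposit = (1.0 + R) * Pi - 1.0
    currency = Pi - 1.0
    return deposit, currency

# Illustrative numbers only: a 10 percent nominal rate and Pi = 0.95.
deposit, currency = real_returns(0.10, 0.95)
print(round(deposit, 4), round(currency, 4))
```

Whenever the nominal rate is positive the deposit return exceeds the currency return, which is why currency is held only for purchases too small to justify the check-writing fee.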

During the second period of life individuals use their savings to
make one large purchase of variable size and a variable number of
small purchases of fixed (maximum) size. For example, one might
think of a period as lasting several days. At the start of the period the
individual purchases housing services by renting one house of variable
size or quality, and throughout the period the individual purchases
food a variable number of times from vendors. It is assumed
that the young can costlessly convert their endowment into goods sold
in large and small amounts; hence the relative price of these goods is
fixed at one. It is also assumed that the large purchase is large enough
(each small purchase is small enough) that it is always made with a
check (currency) since the interest more than covers (does not cover)
the check-writing fee. Finally, net tax payments made by check (with
currency) are denoted by $\tau_b$ ($\tau_c$), and it is assumed that the
individual always makes some tax payment by check. Under these
assumptions the real value of small purchases is given by
$\Pi c - \tau_c$ and the real value of large purchases is given by
$(1 + R)\Pi(s - c) - \tau_b - 2\theta$, which is the real value of interest
plus principal on bank deposits less lump-sum taxes and charges for
writing a check for large purchases and a check for taxes. The individual
maximizes

$$U(y - s) + V[(1 + R)\Pi(s - c) - \tau_b] + W(\Pi c - \tau_c) \tag{1}$$

with respect to $s$ and $c$, where $U', V', W' > 0$ and
$U'', V'', W'' < 0$. Note that the transactions costs, $2\theta$, are
suppressed here as they will be throughout the remainder of the paper
because they do not vary.

Maximizing (1) with respect to $s$ and $c$ results in the following
first-order conditions:

$$-U'(y - s) + (1 + R)\Pi V'[(1 + R)\Pi(s - c) - \tau_b] = 0,$$
$$-(1 + R)V'[(1 + R)\Pi(s - c) - \tau_b] + W'(\Pi c - \tau_c) = 0, \tag{2}$$

which set marginal rates of substitution equal to marginal returns.
Throughout the rest of the paper (2) will be used to derive the demand
functions for currency and savings. Substitution effects will
always be assumed to dominate income effects, and inflation rate
elasticities will be assumed to be less than one.

Before we proceed it is useful to consider the two issues of whether
currency serves as a medium of exchange in this model and whether
this model is subject to the usual criticisms of putting money in the
utility function.

Some overlapping generations models have been criticized for failing
to give money a role as a medium of exchange, and the question
arises, Does money serve as a medium of exchange in this model?
According to one critic (McCallum 1983a, p. 192), the simplest "way
of proceeding is to compare the rate of return of the asset in question
with those of other assets. If in the model economy this asset serves as
the medium of exchange, and the others do not, then it will command
a lower pecuniary rate of return in equilibrium because of the
transactions-facilitating services that it provides its holders." Since in
my model currency pays a lower return (before transactions costs) than
bonds, it seems clear that currency is a medium of exchange.

The idea of public and private payments mechanisms each specialized
for certain types of transactions is similar to that of Lucas and
Stokey (1983), though their private payment mechanism was a credit
system as opposed to a checking system. As pointed out by Lucas and
Stokey, ultimately money does appear in the utility function. The
question arises, What have we gained vis-a-vis directly including
money in the utility function? The primary criticism of putting money
in the utility function (see, e.g., the introduction to Kareken and
Wallace [1980]) seems to be that it involves an implicit or hidden role
for money that makes it very difficult to address a variety of questions,
for example, the question of when gold would replace currency as a
medium of exchange, but the framework presented here could easily
be extended to allow the discussion of such questions. One would
need to consider both differential rates of return on the two assets
and differential lump-sum transactions costs. For example, if the
inflation rate is high enough (because of a high monetary growth rate)
and the rate of change in the relative price of gold, the cost of
weighing gold, and the cost of assaying gold are all low enough, then gold
will dominate currency for making transactions of any given size.


B. Equilibrium Conditions

In addition to satisfying (2), any equilibrium must satisfy the
government budget constraint. This constraint implies that any excess of
government spending and nominal interest payments over explicit
taxes is financed by issuing new debt and is given by

$$g + R\Pi b - \tau_b - \tau_c = \frac{C_t + B_t - C_{t-1} - B_{t-1}}{P_t}, \tag{3}$$

where $g$ is per capita⁴ government spending, $C_t$ and $B_t$ are per
capita nominal currency and bonds, and $P_t$ is the price level.
Throughout the paper government spending will be held constant and the
analysis will be restricted to steady-state equilibria. The assumption of
a steady-state equilibrium implies that the nominal stocks of bonds and
currency are growing at the same rate as the price level,

$$\Pi P_t = P_{t-1}, \qquad \Pi C_t = C_{t-1}, \qquad \Pi B_t = B_{t-1}. \tag{4}$$

4 Here and in what follows "per capita" refers to quantities per member
of the relevant generation, not per person alive.


Substituting (4) into (3) yields the appropriate version of the
government budget constraint:

$$g + R\Pi b - \tau_b - \tau_c - (1 - \Pi)(b + c) = 0, \tag{5}$$

which states that government spending plus interest payments must
be financed by lump-sum and inflation taxes. The left-hand side of (5)
is also the broadest measure of the real deficit,⁵ which adjusts for both
interest payments and the revenues from the inflation tax. This measure
of the real deficit equals the change in the real value of government
debt, and in a steady-state equilibrium it must be constant at
zero. Monetary experiments potentially affect this budget constraint
and the broadly defined deficit through changes in interest payments
or changes in inflation tax revenues. In this section these effects are
offset through changes in lump-sum taxes. One analytical advantage
of this assumption is that the two traditional monetary policy
variables, the debt mix and the inflation rate, are independent.
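
The step from (3) to (5) can be checked numerically: when nominal currency, bonds, and the price level all grow at the steady-state rate in (4), the real value of newly issued debt equals $(1 - \Pi)(b + c)$. A small sketch with made-up nominal stocks (the function name and all numbers are illustrative assumptions):

```python
def new_debt_real(C_prev, B_prev, P_prev, Pi):
    """Real value of newly issued currency and bonds when nominal stocks
    and the price level all grow at the steady-state rate 1/Pi, together
    with the implied real per capita currency and bond holdings."""
    C_t, B_t, P_t = C_prev / Pi, B_prev / Pi, P_prev / Pi
    issued = (C_t + B_t - C_prev - B_prev) / P_t
    return issued, C_t / P_t, B_t / P_t

Pi = 0.95
issued, c, b = new_debt_real(C_prev=40.0, B_prev=60.0, P_prev=2.0, Pi=Pi)
# The right-hand side of (3) should match the inflation-tax term in (5).
print(abs(issued - (1.0 - Pi) * (b + c)) < 1e-12)
```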

In order to hold the real deficit constant, lump-sum taxes vary
according to

$$\tau_b = g + R\Pi b - pb, \qquad \tau_c = -pc, \tag{6}$$

where $p$ is the inflation rate $(1 - \Pi)$. That is, the old receive a
transfer payment in cash that covers the inflation tax on their currency,
and their gross tax liabilities cover government spending and interest
payments on bonds less the inflation tax on bonds.⁶ An alternative would
be to set the cash transfer equal to zero and subtract the inflation tax
on currency from the gross tax liability. The reason this was not done
is that it induces rather bizarre income effects in response to inflation.
Under this alternative assumption the real value of the currency holdings
of the old is $(1 - p)c$ and the real value of their demand deposits net
of tax liabilities is

$$(1 + R)\Pi b - \tau_b = b + pc - g. \tag{7}$$


5 This is the definition advocated, e.g., by Siegel (1979).

6 Government spending is assumed to be large enough that these tax
liabilities are always paid by check.
Thus as the inflation rate rises, the real value of currency falls while
the real after-tax value of demand deposits rises, and there is a
tendency for households to anticipate this and shift from demand deposits
to currency. In order to avoid this kind of income effect the cash
transfer is assumed. This assumption makes the real values of the
currency and demand deposits of the old, after taxes and transfers, $c$
and $b - g$, respectively. Finally note that under (6) the government
budget constraint is met for any inflation rate, interest rate, and mix
of debt.
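
The claim that rule (6) satisfies the budget constraint (5) for any inflation rate, interest rate, and debt mix can be verified with a few arbitrary cases. The sketch below also checks that the after-tax real values of deposits and currency reduce to $b - g$ and $c$. The function names and the numerical cases are illustrative assumptions.

```python
def deficit(g, R, Pi, b, c, tau_b, tau_c):
    """Left-hand side of the steady-state budget constraint (5)."""
    return g + R * Pi * b - tau_b - tau_c - (1.0 - Pi) * (b + c)

def taxes_from_rule(g, R, Pi, b, c):
    """Lump-sum taxes paid by check and in cash under rule (6)."""
    p = 1.0 - Pi                       # inflation rate as defined in the text
    tau_b = g + R * Pi * b - p * b
    tau_c = -p * c                     # negative: a cash transfer to the old
    return tau_b, tau_c

g = 1.0
cases = [(0.10, 0.95, 30.0, 20.0), (0.02, 1.00, 5.0, 45.0), (0.25, 0.80, 49.0, 1.0)]
for R, Pi, b, c in cases:
    tau_b, tau_c = taxes_from_rule(g, R, Pi, b, c)
    balanced = abs(deficit(g, R, Pi, b, c, tau_b, tau_c)) < 1e-9
    deposits_after_tax = (1.0 + R) * Pi * b - tau_b    # should equal b - g
    currency_after_tax = Pi * c - tau_c                # should equal c
    print(balanced, round(deposits_after_tax, 6), round(currency_after_tax, 6))
```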

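The augmented savings and currency demand schedules are derived from
(2) after substituting (6) for lump-sum transfers and $s - c$ for $b$:⁷

$$\begin{aligned}
-U'(y - s) + (1 + R)\Pi V'(s - c - g) &= 0, \\
-(1 + R)V'(s - c - g) + W'(c) &= 0.
\end{aligned} \tag{8}$$

These schedules are given by

$$s = s(R, \Pi), \qquad c = c(-R, \Pi),$$

where

$$\begin{aligned}
s_1 &= \frac{\partial s}{\partial R} = \frac{-\Pi V' W''}{\gamma} > 0, \\
s_2 &= \frac{\partial s}{\partial \Pi}
     = \frac{-(1 + R)V'[(1 + R)V'' + W'']}{\gamma} > 0, \\
-c_1 &= \frac{\partial c}{\partial R} = \frac{V' U''}{\gamma} < 0, \\
c_2 &= \frac{\partial c}{\partial \Pi}
     = \frac{-(1 + R)^2 V' V''}{\gamma} > 0,
\end{aligned} \tag{9}$$

and

$$\gamma = (1 + R)U''V'' + U''W'' + (1 + R)\Pi V''W''.$$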


First note that the tax scheme described by (6) completely eliminates 
income effects so that all that remain are substitution effects. As the 
nominal interest rate rises, the real return to demand deposits rises, 
and large purchases are substituted for initial consumption and small 
purchases. Hence savings rise and currency demand falls. As the 
inflation rate falls (and $\Pi$ rises), the real returns to both currency and
demand deposits rise, and large and small purchases are substituted 
for initial consumption. Hence both savings and currency demand 
rise. 


7 An appendix with derivations is available from the author.

Often in ad hoc macro models it is assumed that the demand for savings
depends only on the real interest rate ($[1 + R]s_1 - \Pi s_2 = 0$) and
currency demand depends only on the nominal interest rate ($c_2 = 0$)
(see, e.g., Mundell 1963). These restrictions do not hold in this model
since large purchases have diminishing marginal utility ($V'' < 0$). If
$V''$ were zero, then the first first-order condition would determine
savings and the second first-order condition would determine currency
demand. Given that $V'' < 0$, savings and currency demand are jointly
determined by both equations. Also, by (9)


$$c_2 = s_2 - \frac{(1 + R)s_1}{\Pi} > 0; \tag{10}$$

the response of currency demand to a decrease in the inflation rate
(with the nominal interest rate held constant) is positive and equal to
the response of savings to decreases in the inflation rate and the
nominal interest rate that keep the real interest rate constant. The
equality in (10) is similar to the usual (microeconomic) result of
symmetric cross-substitution effects. Think of $(1 + R)$ and
$(1 + R)\Pi$ as the relative prices of small purchases and initial
consumption in terms of large purchases. Then $\Pi$ times the response
of small purchases to a change in the relative price of initial
consumption must equal the response of initial consumption to a change
in the relative price of small purchases. The correction of the first
cross-substitution effect by $\Pi$ is necessary because for this problem
the Jacobian matrix is not symmetric; instead the off-diagonal elements
are $(1 + R)\Pi V''$ and $(1 + R)V''$.⁸ A unit increase in the relative
price of initial consumption (with the relative price of small purchases
held constant) is obtained when the inflation rate falls by $1/(1 + R)$
and the nominal interest rate is unchanged; $\Pi$ times the effect on
small purchases is given by $\Pi c_2/(1 + R)$. A unit increase in the
relative price of small purchases (with the relative price of initial
consumption held constant) is obtained when the nominal interest rate
increases by one unit and the inflation rate increases by
$\Pi/(1 + R)$. The effect on initial consumption is given by
$-s_1 + [\Pi s_2/(1 + R)]$. Setting these two effects equal implies that
(10) holds.
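The comparative statics in (9) and the cross-substitution identity (10) can be checked numerically. The sketch below is only an illustration: the values chosen for $R$, $\Pi$, $V'$, $U''$, $V''$, and $W''$ are hypothetical, and the function names are not from the paper. It computes the derivatives once from the closed forms in (9) and once by applying the implicit function theorem to the first-order conditions (8), then confirms that the two agree.

```python
# Illustrative check of (9) and (10). All numbers are hypothetical; only the
# sign restrictions V' > 0 and U'', V'', W'' < 0 are taken from the text.

def closed_forms(R, Pi, Vp, Upp, Vpp, Wpp):
    """Derivatives as stated in (9). Returns (s1, s2, c1, c2)."""
    gamma = (1 + R) * Upp * Vpp + Upp * Wpp + (1 + R) * Pi * Vpp * Wpp
    s1 = -Pi * Vp * Wpp / gamma
    s2 = -(1 + R) * Vp * ((1 + R) * Vpp + Wpp) / gamma
    c1 = -Vp * Upp / gamma                  # c1 is defined as -dc/dR
    c2 = -((1 + R) ** 2) * Vp * Vpp / gamma
    return s1, s2, c1, c2


def implicit_derivatives(R, Pi, Vp, Upp, Vpp, Wpp):
    """Same derivatives from the implicit function theorem applied to (8)."""
    # Jacobian of the two first-order conditions with respect to (s, c).
    a11 = Upp + (1 + R) * Pi * Vpp
    a12 = -(1 + R) * Pi * Vpp
    a21 = -(1 + R) * Vpp
    a22 = (1 + R) * Vpp + Wpp
    det = a11 * a22 - a12 * a21

    def solve(b1, b2):
        # Cramer's rule for [[a11, a12], [a21, a22]] (ds, dc)' = (b1, b2)'.
        return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

    ds_dR, dc_dR = solve(-Pi * Vp, Vp)            # right side is -dF/dR
    ds_dPi, dc_dPi = solve(-(1 + R) * Vp, 0.0)    # right side is -dF/dPi
    return ds_dR, ds_dPi, -dc_dR, dc_dPi


params = dict(R=0.05, Pi=0.97, Vp=1.0, Upp=-2.0, Vpp=-1.5, Wpp=-3.0)
s1, s2, c1, c2 = closed_forms(**params)
identity_gap = c2 - (s2 - (1 + params["R"]) * s1 / params["Pi"])
```

Under these assumed values `identity_gap` is zero up to rounding, which is the equality in (10).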

The equilibrium conditions for this version of the model are

$$\frac{C_0}{P_0} = c(-R, -p), \qquad
\frac{B_0 + C_0}{P_0} = s(R, -p), \tag{11}$$

where $B_0$, $C_0$, and $P_0$ are the initial nominal stock of bonds,
nominal stock of money, and the price level, and $1 - p$ has been
substituted for $\Pi$ with the 1 suppressed. The first equation equates
the real demand for currency and the real supply of currency. The second
equates real savings and the real supply of financial assets or,
equivalently, the excess supply of goods by the young and the real
demand for goods by the old and the government. Note that since all
equilibria are steady states, real supplies and demands are constant
across periods.

8 The off-diagonal elements are not equal because ... to the individual
budget constraint and then second derivatives are ... individual budget
constraint augmented by the tax scheme given ...


C. Monetary Experiments 

Open market purchases are analyzed using (11). Letting $dB_0 = -dC_0$,
the effects on $R$ and $P_0$ are given by

$$\frac{dR}{dC_0} = \frac{-s}{P_0\Delta} < 0, \qquad
\frac{dP_0}{dC_0} = \frac{s_1}{\Delta} > 0, \tag{12}$$

with $\Delta = s_1 c + c_1 s$. These results are the same as those of
Metzler (1951). With an open market purchase, the nominal interest rate
must fall to induce individuals to shift from demand deposits to
currency. This fall in the nominal interest rate induces an excess
demand for output; hence the price level must rise to decrease the real
value of government debt and the demand for goods. Accompanying the fall
in the nominal interest rate and the rise in the price level are shifts
in the composition of consumption. The young save less and consume more
since the interest rate is lower. Also since the interest rate is lower,
the young want more currency and fewer demand deposits, so they supply
less of their endowment to large purchasers and more to small
purchasers. The total consumption of the old falls since the nominal
value of government debt is the same and the price level is higher.
There is also a shift in consumption by the old from large purchases to
small purchases that is induced by the fall in the interest rate.
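As a rough illustration of (12), the sketch below evaluates the two responses for hypothetical values of $c$, $s$, $c_1$, $s_1$, and $P_0$ (none of these numbers come from the paper) and checks that they satisfy the total differential of (11) when $dB_0 = -dC_0$. The function names are illustrative only.

```python
# Open market purchase: dC0 = -dB0. All numerical values are hypothetical.

def open_market_purchase(c, s, c1, s1, P0):
    """Responses dR/dC0 and dP0/dC0 from (12)."""
    delta = s1 * c + c1 * s
    dR = -s / (P0 * delta)
    dP0 = s1 / delta
    return dR, dP0


def linearized_residuals(c, s, c1, s1, P0, dR, dP0, dC0=1.0, dB0=-1.0):
    """Total differential of (11); both residuals should be zero."""
    # Currency market: d(C0/P0) = -c1 * dR, with C0/P0 = c.
    r1 = dC0 / P0 - c * dP0 / P0 + c1 * dR
    # Asset market: d((B0 + C0)/P0) = s1 * dR, with (B0 + C0)/P0 = s.
    r2 = (dB0 + dC0) / P0 - s * dP0 / P0 - s1 * dR
    return r1, r2


c, s, c1, s1, P0 = 0.2, 1.0, 0.5, 0.8, 2.0
dR, dP0 = open_market_purchase(c, s, c1, s1, P0)
res = linearized_residuals(c, s, c1, s1, P0, dR, dP0)
# Elasticity of the price level with respect to currency: (dP0/dC0)(C0/P0).
price_elasticity = dP0 * c
```

With these assumed values the interest rate falls, the price level rises, and `price_elasticity` is below one, so the price level rises less than proportionately with the stock of currency.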

It is of some interest to compare these results with those of
infinite-horizon models (see, e.g., Grossman and Weiss 1983; Barro 1984,
p. 384; Rotemberg 1984). In those models a once and for all open market
operation leaves the nominal (and real) interest rate unchanged, and the
price level rises proportionately with the increase in the money supply.

Several other monetary experiments can also be considered.⁹ First,
currency entering as a transfer (a helicopter drop) has the same
qualitative effects as an open market purchase, though the effects on
the interest rate and price level are smaller and larger, respectively:

$$\frac{dR}{dC_0} = \frac{-(s - c)}{P_0\Delta} < 0, \qquad
\frac{dP_0}{dC_0} = \frac{s_1 + c_1}{\Delta} > 0. \tag{13}$$

9 In addition to analyzing open market purchases, Metzler (1951)
performed a second experiment, increasing the money supply with no
offsetting changes in the private


The helicopter drop differs from the open market purchase in that
there is an initial impact on wealth and an initial rise in the demand
for goods accompanying the initial excess supply of currency. The
initial rise in the demand for goods means that a larger increase in the
price level is necessary to clear the goods market, and the higher price
level helps eliminate some of the initial excess supply of currency.
Consequently the nominal interest rate falls by less. Second, a
revaluation, where $B_0$ and $C_0$ rise proportionately, is neutral,
leaving the interest rate unchanged and causing the price level to rise
proportionately. Third, it is also possible to analyze the effect of an
increase in the monetary growth rate (the inflation rate). The effects
on $P_0$, $R$, and $(1 + R)\Pi$ are given by

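$$\begin{aligned}
\frac{dP_0}{dp} &= \frac{P_0(s_1 c_2 + c_1 s_2)}{\Delta} > 0, \\
\frac{dR}{dp} &= \frac{s_2 c - c_2 s}{\Delta} \gtrless 0, \\
\frac{d[(1 + R)\Pi]}{dp} &= \Pi\frac{dR}{dp} - (1 + R)
  = \frac{-[\Pi c_2(s - c) + (1 + R)c_1 s]}{\Delta} < 0.
\end{aligned} \tag{14}$$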


Note that the last result employs the restriction given in (10). As the
inflation rate rises, the demand for currency falls and the demand for
goods rises; hence the price level must rise to deflate the real supply
of currency and cut the demand for goods. Whether the nominal interest
rate rises or falls depends on the relative inflation elasticities of
savings and currency. If the former (latter) is larger, then the
increase in the inflation rate has stronger effects on savings (currency
demand) than on currency demand (savings); hence the nominal interest
rate must rise (fall) to increase savings (currency demand). Even if the
nominal interest rate rises, it is overshadowed by the increase in the
inflation rate and the real interest rate definitely falls. By way of
comparison, Mundell (1963) assumed that savings respond only to the
real interest rate, $(1 + R)s_1 = \Pi s_2$, and the inflation elasticity
of currency, $c_2$, is zero. Under these assumptions, the higher
inflation rate definitely raises the nominal interest rate.

holdings of other assets. However, his ... supply of goods in the
future. Given his setup, ... increase the price level and the nominal
value of other assets ... Thus Metzler's second experiment corresponds
most closely to the revaluation experiment of this paper.
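The ambiguity of the nominal-rate response in (14), and its resolution under Mundell's assumptions, can be illustrated with a short calculation. This is only a sketch under assumed numbers: the values of $c$, $s$, $c_1$, $s_1$, $s_2$, $R$, $\Pi$, and $P_0$ are hypothetical, and $c_2$ is backed out from restriction (10) rather than chosen freely.

```python
# Increase in the inflation rate p, using (14). All numbers are hypothetical.

def inflation_experiment(c, s, c1, s1, s2, R, Pi, P0):
    """Return (c2, dP0/dp, dR/dp, d[(1+R)Pi]/dp)."""
    c2 = s2 - (1 + R) * s1 / Pi          # restriction (10)
    delta = s1 * c + c1 * s
    dP0 = P0 * (s1 * c2 + c1 * s2) / delta
    dR = (s2 * c - c2 * s) / delta
    dreal = Pi * dR - (1 + R)            # response of the real return
    return c2, dP0, dR, dreal


base = dict(c=0.2, s=1.0, c1=0.5, s1=0.8, R=0.05, Pi=0.97, P0=2.0)

# General case: currency demand is inflation elastic (c2 > 0).
general = inflation_experiment(s2=1.5, **base)

# Mundell case: savings depend only on the real rate, (1 + R)s1 = Pi*s2,
# which together with (10) forces c2 = 0.
mundell_s2 = (1 + base["R"]) * base["s1"] / base["Pi"]
mundell = inflation_experiment(s2=mundell_s2, **base)
```

In the general case chosen here the nominal rate falls, while in the Mundell case it rises; in both, the real return $(1 + R)\Pi$ falls, as the text states.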

III. Crucial Assumptions 

In addition to the role of money, the model developed in the previous
section differs from earlier overlapping generations models on
assumptions about the interaction of fiscal and monetary policy and
about the supply-side effects of monetary policy. This section points
out the crucial role of these assumptions, showing that even when
money has a medium of exchange role either of these earlier assumptions
implies contractionary effects for open market purchases.


A. The Interaction of Fiscal and Monetary Policy 

In Section II fiscal policy is held constant in the sense that both
government spending and the broadly defined government deficit (which
includes interest payments and inflation tax revenues) are held
constant. Thus the effects of monetary experiments on interest payments
and inflation tax revenues are offset by variations in lump-sum taxes.
However, Bryant and Wallace (1979) and Sargent and Wallace (1981)
assumed constant fiscal policy in the sense that both government
spending and the narrowly defined government deficit are held constant.
The effects of this assumption can be seen with a simple modification of
the model given in Section II.

The narrowly defined real government deficit is real government
spending less taxes, and holding this constant implies that lump-sum
taxes are constant. Additionally, it is assumed that cash transfers are
zero. Under these assumptions inflation tax revenues must vary to
offset any change in the real interest on government bonds, for the
government budget constraint is

$$g - \tau_b = ps - R\Pi b
  = \frac{p(B_0 + C_0) - R\Pi B_0}{P_0}. \tag{15}$$

Equation (15) in conjunction with (11) determines the nominal interest
rate, the initial price level, and the inflation rate. The demand
functions for this version of the model are derived from (2) with
$\tau_c = 0$ and $\tau_b$ fixed. Though some income effects arise in
this version of the model, substitution effects are assumed to dominate
so that the restrictions given in (9) and (10) are still correct.

The effects of an open market purchase are given by

$$\begin{aligned}
\frac{dR}{dC_0} &= \frac{-s[s + Rb - ps_2 + R\Pi(s_2 - c_2)]}{P_0\Delta} < 0, \\
\frac{dp}{dC_0} &= \frac{-s[R\Pi c_1 + \Pi b + (R\Pi - p)s_1]}{P_0\Delta} < 0, \\
\frac{dP_0}{dC_0} &= \frac{s_1(s + Rb - R\Pi c_2) - s_2\Pi(Rc_1 + b)}{\Delta} \\
  &= \frac{s_1(c - pc_2) - s_1(R\Pi - p)c_2 - c_2\Pi b - s_2\Pi Rc_1}{\Delta}
  \gtrless 0,
\end{aligned} \tag{16}$$

where $dB_0 = -dC_0$ and

$$\begin{aligned}
\Delta &= c_1[R\Pi bs_2 + s(s - ps_2) + sRb]
  + s_1[(R\Pi - p)c_2 b + c(c - pc_2)] \\
  &\quad + b\{\Pi bc_2 + c[(1 + R)s_1 + \Pi c_2 - \Pi s_2]\} > 0.
\end{aligned}$$

In addition to earlier assumptions, (16) is based on the assumptions
that the real interest rate is positive and the inflation elasticities
of currency demand and savings are less than one:

$$\begin{aligned}
R\Pi - p = R\Pi - (1 - \Pi) &= (1 + R)\Pi - 1 > 0, \\
s - ps_2,\; c - pc_2 &> 0.
\end{aligned} \tag{17}$$

With an open market purchase, the nominal interest rate must fall to
induce individuals to shift out of demand deposits into currency. The
lower interest rate and the decrease in bonds both cause interest
payments by the government to fall and put downward pressure on the
inflation rate. Since both the nominal interest rate and the inflation
rate are falling, the effects on savings and the price level are
ambiguous. If the inflation (interest) effects dominate, savings rise
(fall) and the price level must fall (rise) to clear the goods market.

The case where an open market purchase can have contractionary
effects is similar to that of Blinder and Solow (1973), who incorporated
a government budget constraint in a fixed-price Keynesian model and
showed that an open market purchase results in lower output. In their
model tax revenues were proportional to real output, and when an open
market purchase lowered interest payments, tax revenues had to fall
through a decline in real output.

Before we consider assumptions about the supply-side effects of
monetary policy, it is of some interest to consider a case that is
intermediate between that of earlier overlapping generations models and
mine. Specifically we now consider the case where the conventionally
defined deficit is held constant. The conventional definition of the
real deficit adjusts for interest payments but not for revenues from
the inflation tax. When this real deficit is held constant, the monetary
authority can determine the mix of debt, but the monetary growth
rate must vary to maintain the government budget constraint.

The conventionally defined real deficit is given by

$$g + R\Pi b - \tau_c - \tau_b = g - \bar{\tau}. \tag{18}$$

For this case there is no cash transfer, and lump-sum taxes are
decomposed into two components:

$$\tau_b = R\Pi b + \bar{\tau}. \tag{19}$$

One component, $R\Pi b$, covers the real value of nominal interest
payments and is allowed to vary with open market operations. The second
remaining component, $\bar{\tau}$, is held constant;¹⁰ thus the real
deficit is held constant. Additionally it is assumed that $\bar{\tau}$
is less than $g$ so that the real deficit must be financed by an
inflation tax on government debt. This can be seen by substituting (19)
into (5) to obtain

$$g - \bar{\tau} = p(c + b) = \frac{p(C_0 + B_0)}{P_0}. \tag{20}$$

Equation (20) in conjunction with (11) determines the nominal interest
rate, the initial price level, and the inflation rate. The demand
functions for this case are derived from (2) when (19) is substituted
for lump-sum taxes. Once again substitution effects are assumed to
dominate any income effects so that the restrictions given in (9) are
still correct. In what follows inflation elasticities will be crucial,
and once again these are assumed to be less than one.

The effects of an open market purchase are given by

$$\frac{dR}{dC_0} = \frac{-(s - ps_2)}{P_0\Delta} < 0, \qquad
\frac{dP_0}{dC_0} = \frac{s_1}{\Delta} > 0, \qquad
\frac{dp}{dC_0} = \frac{ps_1}{P_0\Delta} > 0, \tag{21}$$

where $dB_0 = -dC_0$ and $\Delta = c_1(s - ps_2) + s_1(c - pc_2) > 0$.
With an open market purchase the nominal interest rate initially falls
to induce individuals to shift into currency. This fall in the nominal
interest rate decreases savings and creates an excess demand for goods
so that the initial price level must rise. As the price level rises the
real value of the base for the inflation tax falls and the inflation
rate must rise to stabilize inflation tax revenues.

10 It is assumed that $\bar{\tau}$ is always high enough that households
choose to pay taxes by check.

Thus when the conventionally defined real deficit is held constant,
the effects of an open market purchase are expansionary though not
precisely the same as those of conventional analyses. In particular the
monetary growth rate and the mix of debt are no longer independent
so that an open market purchase now increases the monetary growth
rate and the inflation rate.


B. The Dominance of Supply-Side Effects

In Section II it is assumed that negligible resources are released from
the intermediary industry when there is a marginal shift from bonds
to currency; however, Bryant and Wallace (1979, 1980) make the
opposite assumption and further assume that the effects of these
released resources on the supply of goods dominate any demand-side
effects of monetary policy. The effects of this assumption can be seen
with a simple modification of the model given in Section II.

Let $\alpha$ be the constant real resource cost of intermediary bonds
incurred when the bonds are initially purchased. Under this assumption
savings are split between currency, bonds, and initial transactions
costs ($s = c + b + \alpha b$), the real value of small purchases is
once again given by $\Pi c - \tau_c$, and the real value of large
purchases is now given by
$[(1 + R)\Pi(s - c)/(1 + \alpha)] - \tau_b$. The development of savings
and currency demand schedules parallels that of Section II, and these
schedules are qualitatively unchanged. However, the second equilibrium
condition given in (11) must be modified so that the real supply of
financial assets is equated to savings net of initial transactions
costs, or, equivalently, the excess supply of goods by the young net of
initial transactions costs must equal the real demand for goods by the
old and the government. Then equations (11) become

$$\frac{C_0}{P_0} = c(-R, -p), \qquad
\frac{B_0 + C_0}{P_0} = s(R, -p) - \frac{\alpha B_0}{P_0}. \tag{22}$$

The effects of an open market purchase are given by

$$\frac{dR}{dC_0} = \frac{-(1 + \alpha)s}{P_0\Delta} < 0, \qquad
\frac{dP_0}{dC_0} = \frac{s_1 - \alpha c_1}{\Delta} \gtrless 0, \tag{23}$$

where $dB_0 = -dC_0$ and $\Delta = s_1 c + c_1(s + \alpha b) > 0$. In
addition to the "demand-side" effects of an open market purchase, which
cause the interest rate to fall and the price level to rise, such a
purchase reduces the amount of bonds outstanding, thereby reducing
transactions costs and increasing the resources available for
consumption. The resulting excess supply of goods puts downward pressure
on both the interest rate and the price level. If this "supply-side"
effect is strong enough it will more than offset the rise in the price
level due to demand-side effects.


IV. Conclusion 

This paper developed an overlapping generations model of open
market operations and showed that under certain assumptions such a
model is consistent with the results of Metzler (1951) that open market
purchases are expansionary. Crucial assumptions are that currency has a
medium of exchange role, that fiscal policy is held constant in the
sense that both government spending and the broadly defined deficit are
held constant, and that the demand-side effects of an open market
operation dominate the supply-side effects. Though the model does
predict that open market purchases have expansionary effects, it is not
entirely consistent with popular views, for it predicts that such
purchases can permanently lower the steady-state real interest rate and
raise the price level less than proportionately. The same wealth effects
that give rise to a dependence of the steady-state real interest rate
and price level on fiscal policy in overlapping generations models also
give rise to a dependence of the steady-state real interest rate on
monetary policy and less than proportionate changes in the price level
in conjunction with open market operations.¹¹

A Model of Declining Health and Retirement


John R. Wolfe

Michigan State University

The declining U.S. average retirement age has been difficult to
reconcile with cross-section evidence that poor health is a primary
cause of early retirement. This paper reconciles those observations by
building a Grossman-type model of investment in health, distinguished
from its predecessor by a more formal treatment of initial conditions.
A discontinuous mid-life increase in health investment (retirement) is
shown, in general, to be part of the optimal investment strategy.
Time-series productivity increases are shown to lower the optimal
retirement age, while varying health endowments and depreciation rates
are shown to be able to account for cross-section retirement age
variations.


Introduction 

Why do some American workers retire earlier than others? In attempting
to answer this question, one must address these apparently inconsistent
observations: (1) The average retirement age in the United States has
fallen dramatically since 1960.¹ (2) Self-reported poor health is almost
always found to be one of the most important explanatory variables in
studies of early retirement.² (3) The level of general health among
older Americans appears to be holding steady or improving,³ and
mortality continues to improve steadily.

I wish to thank John Goddeeris, Michael Grossman, Daniel S. Hamermesh,
James J. Heckman, and colleagues at Michigan State University for
valuable comments and suggestions.

1 One important indicator is the steady trend toward earlier receipt of
Social Security retired worker benefits. In 1961, 7.2 percent of new
awards for men and 25.3 percent of new awards for women were received at
age 62, the youngest allowable age. By 1980 these proportions had risen
to 30.1 percent and 45.9 percent, respectively (see U.S. Department of
Health and Human Services 1982, p. 102).

2 See the studies cited in Clark, Kreps, and Spengler (1978, pp. 935-36).

Attempts to reconcile these observations have generally argued that
they are not all true. Some are skeptical about self-reported reasons
for retirement, arguing that workers actually respond to incentives
built into retirement programs but find health a more socially
acceptable explanation (see, e.g., Campbell and Campbell 1976). Others
question the accuracy of health measures or argue that mortality
improvements need not be accompanied by morbidity improvements (see
Wilson 1981; Feldman 1982).

None of this skepticism is necessary, however: the observations can
be reconciled. In this paper I present a model in which investment in
health is the only reason to cut back on work and in which rising
productivity can lead both to better health and to earlier retirement.
This is because greater productivity can be shown to bring higher
aspirations for retirement health: the greater the health that one
intends to maintain in later years, the earlier one must begin
significant investments of time in counteracting the depreciation of
health. That this investment may optimally begin in mid-life, and at
some nonzero level, is what distinguishes this model from Grossman's
(1972a, 1972b) well-known model of investment in health.

The modeling approach taken in this paper can in fact be viewed as
an effort to add a new element to Grossman's model by solving a
problem that was analytically convenient for him to avoid. His work
establishes marginal conditions for investment in health: the rate of
return must be greater than or equal to the user cost of health capital.
While his model is able to explain aging and death as the results of
rational economic decision making, nothing in his formulation leads
naturally to the kind of abrupt increase in time investments in health
that we associate with retirement.

Just such an abrupt increase may obtain, however, if initial conditions
are treated rigorously. Consider the initial situation: given the
depreciation rate of health, one can calculate the initial user cost of
health capital. Because health is subject to diminishing marginal
returns, there is a unique level of health (call it $h_0^*$) whose rate
of return would exactly equal this user cost. There is no reason to
expect the endowment of health to exactly equal $h_0^*$. Grossman
assumes any excess of the initial stock over $h_0^*$ to be
instantaneously disposed of.⁴ Greater lifetime welfare could be
achieved, however, by delaying investment in health, allowing health to
depreciate until its rate of return would rise to equal at least user
cost. The investment of time in health maintenance that becomes
desirable at this point could be interpreted as retirement. Before
retirement, health investment would not conform to Grossman's marginal
conditions but would instead be bound by the constraint that gross
investment be nonnegative. One characteristic of this optimal investment
strategy is especially suggestive of retirement: investment does not
increase gradually from zero but instead rises discontinuously in
mid-life from zero to a positive amount.

3 For example, three measures of health show no perceptible trend in
annual data: percentage assessing own health as fair or poor, average
number of restricted-activity days, and number of bed disability days.
(Data are from U.S. Department of Health and Human Services [1979-8?].)
This is so in spite of the rise in the average age of this segment of
the population because of mortality improvements.

The next section presents a simplified version of Grossman's
production model. An initial surplus of health will be assumed and
justified on these grounds: that the human species, with its goal of
self-preservation, confronts a different problem than the individual
who seeks to maximize utility. The evolutionary solution to the former
may entail an excessive health endowment in the sense that an
individual might prefer to have less health and to be compensated
with wealth in a more liquid form.⁵

Model 

Individuals seek to maximize lifetime utility, which depends on a
consumption stream $c(t)$. Consumption is only one use for income.
Competing uses are savings and investments in health. Income in
turn depends on both types of assets: healthier individuals are more
productive, while financial assets earn interest. We will make no
distinction between market work and household production. Each provides
identical "income," which can be devoted to consumption or
investment in health. The problem, then, is to maximize

$$\int_0^T e^{-pt}\,U[c(t)]\,dt, \tag{1}$$

subject to

$$\dot{h} = -\delta h + f(h) + ra - c - s \ge -\delta h, \tag{2}$$

$$\dot{a} = s, \tag{3}$$

and

$$h(0) = h_0, \quad a(0) = a_0, \quad a(T) = 0, \tag{4}$$

where $U[c(t)]$ = utility of consumption, $T$ = life span, $p$ = rate
of time preference, $h(t)$ = health stock, $\delta(t)$ = depreciation
rate of health, $r$ = rate of interest on financial assets, $a(t)$ =
asset stock, and $s(t)$ = savings. Assume also that

$$U' > 0, \quad U'' < 0, \tag{5}$$

and

$$f' > 0, \quad f'' < 0. \tag{6}$$

Health and assets provide incomes $f(h)$ and $ra$, respectively, and, as
in (2), this income can be devoted to consumption, savings, or gross
investment in health.⁶ Health can never deteriorate more rapidly than
its depreciation rate. In a departure from Grossman's work, which
greatly simplifies the solution, life span $T$ will be treated as fixed.

4 This limitation has also been noted by Muurinen (1982).

5 The greatest concern of the species may be the prevention of death due
to acute illness before reproduction. This may entail a high initial
level of health. See Becker (1981, chaps. 5, 9) for a discussion of
parents' investment in children in the context of natural selection.

Solution 

This is an optimal control problem with control variables $c$ and $s$
and stocks $h$ and $a$. Because of the inequality constraint in (2),
however, it is convenient to replace $s$ with a new control variable,
$q \equiv c + s - ra$. When the constraint is binding, we know from (2)
that $q = f(h)$, so that all income is devoted to consumption and
savings and none to investment in health.

The Hamiltonian for this problem is

$$H = e^{-pt}\{U(c) + \lambda_h[f(h) - q - \delta h]
  + \lambda_a(q - c + ra)\}. \tag{7}$$

Differentiating with respect to $c$, $q$, $h$, and $a$, we obtain⁷

$$\frac{\partial H}{\partial c} = e^{-pt}[U'(c) - \lambda_a], \tag{8}$$

$$\frac{\partial H}{\partial q} = e^{-pt}(-\lambda_h + \lambda_a), \tag{9}$$

$$\dot{\lambda}_h = \lambda_h[p + \delta - f'(h)], \tag{10}$$

and

$$\dot{\lambda}_a = \lambda_a(p - r). \tag{11}$$

6 Note that health does not enter the utility function but is desirable
nevertheless because it increases lifetime consumption possibilities.
7 Conditions (10) and (11) follow from setting time derivatives of
$e^{-pt}\lambda_h$ and $e^{-pt}\lambda_a$ equal to
$-\partial H/\partial h$ and $-\partial H/\partial a$, respectively.

Consumption

Equation (11) tells us that $\lambda_a$, the shadow price of assets,
either grows or declines continuously over the lifetime, depending on
whether $p$ is greater than or less than $r$. From (8), we set
$U'(c) = \lambda_a$; because $U'' < 0$, $c$ will decline over time if
$p > r$ and rise if $p < r$. Greater $a_0$ implies greater $c_0$, to
exhaust assets at $T$.


Investment in Health 

Equation (9) tells us that if $\lambda_a$ is greater than (less than)
$\lambda_h$ we should set $q$ equal to its maximum (minimum) value. In
the former case, $q = f(h)$: the shadow price of health is so low that
all income is devoted to savings and consumption.⁸

Consider finally a time path along which $\lambda_h = \lambda_a$. In
this case, $\dot{\lambda}_h = \dot{\lambda}_a$, and (10) and (11) tell
us that

$$f'(h) = r + \delta. \tag{12}$$

This is Grossman's condition that the return to health capital must
equal its user cost. Under these circumstances, $q$ is no longer
determined by (9), which always vanishes. Instead, equation (12), by
describing a time path of $h$, which we may call $h^*(t)$, implicitly
prescribes investment in health.

What remains is to demonstrate that the utility-maximizing solution
eventually requires $\lambda_h = \lambda_a$ and $h(t) = h^*(t)$,
regardless of initial values of $\lambda_a$ and $\lambda_h$. This can be
seen with the aid of figure 1. When $\lambda_a \gtrless \lambda_h$, we
set $q$ at its maximum (minimum), in which case $\dot{h} \lessgtr 0$.
This establishes the horizontal arrows. When $h \gtrless h^*$, then
$f'(h) \lessgtr f'(h^*)$, so that $f'(h) \lessgtr r + \delta$. By (10)
and (11), this tells us that
$\dot{\lambda}_h/\lambda_h \gtrless \dot{\lambda}_a/\lambda_a$, which
establishes the vertical arrows. The only path to a stable equilibrium
is therefore the one indicated in the figure.⁹

If we allow $\delta'(t) > 0$, then $h^*(t)$ is no longer a constant but
must decline over time, as Grossman found, because of (12) and the fact
that $f''(h) < 0$. The additional implication of this model is that
investment in health is the solution to a bang-bang control problem: if
$\lambda_h < \lambda_a$ initially, the optimal time path of health is as
pictured in figure 2, declining at rate $\delta$ until some time $t_1$,
when the marginal return to investment in health has risen to equal user
cost. At $t_1$ gross investment in health switches from zero to a
positive amount. A possible time path is shown in figure 3.

8 The latter case is of less interest: there is no minimum constraint on
$q = c + s - ra$, so that any amount of dissaving or borrowing may be
undertaken to bring $h$ up to its desired level.

9 All other paths diverge in the limit toward $h = 0$ or $h = \infty$.
These are minima, not maxima. A steady state where $h = 0$ would provide
the smallest possible lifetime income. As $h$ approaches $\infty$, the
user cost of health outweighs its marginal product by an increasingly
large amount.

Fig. 3

The individual's situation between times 0 and $t_1$ can be described
this way. Health is so abundant and its marginal product so low that it
would be desirable to exchange some health for financial assets in
order to finance higher lifetime consumption. The constraint in (2)
makes this impossible. Seemingly irrational real-life behavior, such as
working "too hard" at the expense of one's health or consuming an
unhealthy diet, can be viewed as rational efforts to exchange health
for consumption when $\lambda_h < \lambda_a$. To the extent that such
exchanges are limited, however, youth is wasted on the young.

Beyond retirement age $t_1$ a large portion of potential income $f(h)$
is devoted to maintaining health. Consumption is continuous at $t_1$,
yet gross investment is discontinuous. Saving must therefore also be
discontinuous, falling at $t_1$. In addition, asset exhaustion requires
that $s$ eventually be negative. The model therefore generates a
life-cycle pattern of saving and dissaving.¹⁰


Optimal Retirement Age 

Because there is no investment in health until age t\, h(t\) must satisfy 
h(t) = A„e~ / .' S<T, '' T , t E [0, /,]. (13) 

Ageti is also the initial age when X* = X„, so h(t\) must also satisfy (12). 


10 Empirically, retirement is normally a reduction in market work alone. A possible 
extension, following Gronau (1977), would be to regard time devoted to household 
production as subject to diminishing marginal returns, so that one devotes time to 
household production only to the point where the value of its marginal product equals 
the market wage, any additional available time being spent at market employment. 
Thus at $t_1$, when more time is devoted to investment in health and less to production of
consumption commodities and storable assets, time would first be withdrawn from 
market employment. 
$$f'\!\left(h_0 e^{-\int_0^{t_1} \delta(\tau)\,d\tau}\right) = r + \delta(t_1), \tag{14}$$

which implicitly defines $t_1$ as a function of $h_0$, $\delta(t)$, and $r$. Note that $t_1$
does not depend on $\rho$, the rate of time preference. While the solution
for $t_1$ depends on specific functional forms chosen for $f$ and $\delta$, we can,
without loss of generality, sign the derivatives of $t_1$ with respect to
other parameters by totally differentiating (14).

Before doing so, let $f(h) = w g(h)$ and $\delta(t) = \alpha\gamma(t)$, where $w$ and $\alpha$ are
scalar productivity and depreciation indices, respectively. Totally
differentiating (14), we obtain

$$[-\alpha\gamma w g''(h)h - \alpha\gamma']\,dt_1 = dr + \left[\gamma + w g''(h)h \int_0^{t_1} \gamma(\tau)\,d\tau\right] d\alpha - w g''(h)h\,d\ln h_0 - g'(h)\,dw, \tag{15}$$

where all functions are evaluated at $t_1$.
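The intermediate steps of the differentiation can be sketched as follows. This is a reconstruction under an assumption: that the retirement condition takes the form $w g'(h) = r + \alpha\gamma(t_1)$, with $h$ the preretirement health stock from (13) evaluated at $t_1$. That form is the one consistent with the differential reported here; the steps are not quoted from the text.

```latex
\begin{align*}
h &= h_0 \exp\!\Big(-\alpha \int_0^{t_1} \gamma(\tau)\,d\tau\Big), \\
dh &= h\Big[\,d\ln h_0 - \Big(\int_0^{t_1} \gamma(\tau)\,d\tau\Big)\,d\alpha
      - \alpha\gamma\,dt_1\Big], \\
g'(h)\,dw + w g''(h)\,dh &= dr + \gamma\,d\alpha + \alpha\gamma'\,dt_1 .
\end{align*}
```

Substituting the second line into the third and collecting the $dt_1$ terms on the left gives the coefficient $-\alpha\gamma w g''(h)h - \alpha\gamma'$ on $dt_1$, with the remaining terms in $dr$, $d\alpha$, $d\ln h_0$, and $dw$ on the right.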

Because $\gamma'$ and $g''$ are negative and $g'$ is positive, the signs of the
coefficients of $dt_1$, $dr$, $d\ln h_0$, and $dw$ are evident: greater initial health,
a higher interest rate, or lower productivity cause $t_1$ to be chosen later.
In figure 2, greater $h_0$ raises the entire preretirement path of health,
while greater $r$ or lower productivity lowers the desired path of
postretirement health by (12). Note that the inverse relation between
productivity and retirement age requires variation in productivity at
postretirement tasks such as recreation and health maintenance.
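These sign results can be illustrated numerically. The sketch below rests on several assumptions that are not in the paper: the retirement condition is taken to be $w g'(h(t_1)) = r + \alpha\gamma(t_1)$, $g(h) = \ln h$ (so $g'' < 0$), and $\gamma(t) = 1$ (constant depreciation, the boundary case $\gamma' = 0$). The function names and parameter values are hypothetical. Under these forms $t_1$ has the closed form $\ln[(r + \alpha)h_0 / w] / \alpha$, which the bisection solver should reproduce.

```python
import math

def marginal_product(h, w):
    # Assumed form f(h) = w * ln(h), so f'(h) = w / h and g''(h) < 0.
    return w / h

def retirement_age(h0, r, w, alpha, t_max=200.0, tol=1e-10):
    """Solve w*g'(h(t1)) = r + alpha for t1 by bisection (gamma(t) = 1 assumed)."""
    def gap(t):
        h = h0 * math.exp(-alpha * t)        # preretirement health, as in (13)
        return marginal_product(h, w) - (r + alpha)

    if gap(0.0) >= 0.0:
        return 0.0                           # investment is worthwhile from age 0
    if gap(t_max) < 0.0:
        raise ValueError("no retirement age found before t_max")
    lo, hi = 0.0, t_max
    while hi - lo > tol:                     # gap() is increasing in t
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical baseline and one-at-a-time parameter changes.
base = retirement_age(h0=100.0, r=0.03, w=1.0, alpha=0.02)
more_health = retirement_age(h0=120.0, r=0.03, w=1.0, alpha=0.02)
higher_rate = retirement_age(h0=100.0, r=0.04, w=1.0, alpha=0.02)
more_productive = retirement_age(h0=100.0, r=0.03, w=2.0, alpha=0.02)
```

In this example greater initial health and a higher interest rate both push $t_1$ later, while higher productivity pulls it earlier, matching the signs stated in the text. The effect of $\alpha$ is left out because its sign depends on the functional forms chosen.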

The effect of $\alpha$ on $t_1$ is ambiguous: greater $\alpha$ implies a more rapid
decline in preretirement health $h(t)$ but also lower desired
postretirement health $h^*(t)$ because higher depreciation increases the user cost
of health. It is evident from equation (15) that the tendency to cause
earlier retirement is increasingly likely to dominate as productivity $w$
increases.

The retirement age $t_1$ does not depend on initial assets because a
change in assets leaves equation (14) unaffected. Greater initial health
$h_0$ implies not only later retirement but also greater consumption
because a shorter period of investment in health leaves more wealth
available to consume.

Discussion 

The model demonstrates that it is possible for the average retirement
age to fall at the same time that postretirement health improves. Both
may be the result of rising time productivity. These trends have also
been shown to be consistent with the frequent survey response of
declining health as the reason for early retirement. In this model,
maintaining one’s health is the only motive for retirement. Finally, the
earliest retirees may also report the poorest health: they may be the
individuals with the most rapid depreciation rates of health.

The model predicts that investments in health may be avoided
altogether until mid-life, yet some of the greatest expenditures of time
and money on health are in fact made very early in life, even before
birth. What has been omitted from the model is acute illness: no
random threats to health have been considered. 11 Instead, the
concept of health embodied in this model is that of general vitality.
Investments are undertaken because they slow the advance of chronic
conditions associated with aging. These investments alone are
depicted in figure 3.

The omission of acute illness and assumption of a fixed terminal
age make this model a limiting case of the model of health proposed
by Fries (1980). In his view, U.S. mortality due to acute illness is being
gradually eradicated, and the average human life span is converging
to an upper limit. Future health improvements will consist
increasingly of reductions in chronic illness, permitting more vigorous
activity within a biologically fixed life span.

The model has implications for recent efforts to reverse the trend
toward earlier retirement in the United States. One of the key
provisions of the social security legislation adopted in 1983 is a gradual,
2-year increase in the age of eligibility for full retirement benefits, to be
fully effective in 2027. Proponents of the change have cited
improvements in mortality (see, e.g., National Commission on Social
Security Reform 1983, chap. 4, supp. statement 1), while opponents
have argued that the health of workers in their sixties may not
improve sufficiently to add 2 productive years to the typical worker’s
career (supp. statement 2).

The analysis here suggests that health among older persons has
been improving largely because of earlier retirement: workers are
taking greater advantage of a potentially more rewarding retirement
by beginning it earlier. If the change in social security rules is
successful in inducing workers to delay retirement by 2 years, recent declines
in chronic illness among older Americans may be slowed. However, it
is likely to be difficult to convince workers to delay retirement. As we
continue to find more ways to use retirement time productively, we
can expect successive generations of workers to seek even better
retirement health, a goal that requires that retirement begin earlier, not
later.

11 If we suppose that health may reduce the likelihood that acute illness will result in
death, then mortality data can provide further support for the model. As evidence of
the role of productivity, Weatherby, Nam, and Isaac (1983) find the mortality
advantage enjoyed by more developed countries to be especially great at postretirement ages.
Wolfe (1983) found early retirement to be associated with higher mortality, providing
evidence that retirement depends on cross-sectional variation in depreciation.
Miscellany


The Case of Erudite Economists 


I. Introduction 

Compared with other social scientists, economists are less inclined to
refer to, much less to quote directly, the work of others. Such
reticence encompasses both the writings of other economists and general
literary products. There have been a few cases where economists have
quoted or referred to a literary work, most notably Alice in
Wonderland, but these are exceptions. However, the saga of the great
detective Sherlock Holmes, as recorded by John H. Watson, M.D., has
attracted some attention from prominent economists.

In three cases wherein economists have alluded to the Sacred
Writings (as the saga is known to devout Holmesians), 1 it is obvious that
either the economists in question have not completely read the record
of the case or they were careless in reporting their information and,
therefore, gave to their fellow economists an erroneous impression of
Holmes.

It is the aim of the present paper to set the record straight by 
showing where the economists have deviated from the record. 


II. The Case of the Three Economists 

It is fitting to start with the reference made to the greatest detective
of all time by the greatest living economist. There is conclusive
evidence that we have here a case of the spoofing detective and the
unsuspecting economist. Paul Samuelson begins the introduction to
the fifth edition of his Economics thus: “Doctor Watson has told us how
very astonished he was to learn that the great Sherlock Holmes had


1 In another instance, von Neumann and Morgenstern (1947, pp. 176-78) analyzed,
from a game-theoretic perspective, the decisions made by the two adversaries as
Moriarty pursued Holmes from London to Canterbury in “The Final Problem.” I am
indebted to George Stigler for pointing this out to me.
never heard of the solar system” (Samuelson 1961, p. 3). He goes on
to note that the doctor was even more surprised to learn that Holmes
shunned any such knowledge on the grounds that it would not leave
room in his mind for more important subjects.

It is clear that Samuelson does not directly attribute such ignorance
to Holmes and is merely recording Watson’s feelings. The damage,
nevertheless, is done, and an impression is left that perhaps Holmes
was a dunce. The facts of the case are very different. Watson’s
statement, which is the source of Samuelson’s reference, occurs in A Study
in Scarlet, the first Holmes adventure to be recorded by Watson.* In
the course of chronicling this adventure, Watson describes his new
friend and makes a list showing the extent of Holmes’s knowledge on
different subjects (Doyle 1967, 1:156). However, there is ample
evidence that in the early days of their association Holmes was pulling
Watson’s leg. Moreover, it is very improbable that, in 1881, a man
educated for 5 years in the best universities of England would be
totally ignorant of the solar system. As Howard Collins (1911) has
pointed out, Holmes knew about the planetary system, since on one
occasion (“The Adventure of the Bruce-Partington Plans”) he said
about his brother, “ ‘But that Mycroft should break out in this erratic
fashion! A planet might as well leave its orbit’ ” (Doyle 1967, 2:434).
In “The Greek Interpreter,” Watson himself notes that “after tea on a
summer evening, . . . the conversation . . . roamed . . . from golf clubs
to the causes of the change in the obliquity of the ecliptic . . .” (Doyle
1967, 1:590). Furthermore, how is it that a man who thought his brain
was like an attic and should not be cluttered with irrelevant
information could be caught comparing Hafiz and Horace or quoting a
Persian proverb (“A Case of Identity”)?

Next we have one of the most original thinkers in economics
referring to Holmes. But apparently even when great minds meet
something gets lost. Kenneth Arrow, in his presidential address to the
American Economic Association, said: “Sherlock Holmes once
maintained to the dimwitted local police inspector so typical of English
detective stories that the significant question in the case was
the dog’s barking at night. ‘But,’ said the inspector, ‘the dog didn’t
bark.’ ‘That,’ said Holmes, ‘is what is significant’ ” (Arrow 1974, p. 1).

Readers will immediately recognize that the reference is to an
exchange occurring in “Silver Blaze,” one of the finest adventures in the
saga. The incident as recorded by Watson is as follows. Inspector
Gregory, who is in charge of the case, asks Holmes:

* In the saga there are two recorded Holmes adventures that predate this
adventure. They are not witnessed by Watson; Holmes relates
them. The adventures are “The Gloria Scott” and “The Musgrave Ritual.”
“Is there any other point to which you would wish to draw
my attention?” 

“To the curious incident of the dog in the night-time.” 
“The dog did nothing in the night-time.” 

“That was the curious incident,” remarked Sherlock 
Holmes. [Doyle 1967, 2:277] 

The curious incident, however, is Arrow’s evaluation of Inspector
Gregory as dim-witted. Watson describes Gregory as “a tall fair man
with . . . curiously penetrating light blue eyes . . . who was rapidly
making his name in the English detective service” (Doyle 1967,
2:269). And here are the remarks of Holmes himself about the
inspector: “ ‘Inspector Gregory, to whom the case has been committed,
is an extremely competent officer. Were he but gifted with
imagination he might rise to great heights in his profession’ ” (Doyle 1967,
2:268). Perhaps Arrow wants to imply that those who are not gifted
with imagination and do not rise to great heights in their profession
are dim-witted.

Leaving the Olympian heights of the Nobel laureates, we have
Edward Leamer christening a particular type of specification search
after the great detective. In his thought-provoking book on
specification analysis, Leamer writes: “The last search is what I have called
postdata model construction. I also like to call it ‘Sherlock Holmes
inference.’ Sherlock solves the case by weaving together all the bits of
evidence into a plausible story” (Leamer 1978, p. 11). His source is a
statement made by Holmes to Watson in A Study in Scarlet (again the
first recorded adventure). Leamer continues: “in response to a
question from Dr. Watson concerning the likely perpetrators of the crime,
Holmes replied, ‘No data yet. . . . It is a capital mistake to theorize
before you have all the evidence. It biases the judgement.’ ” As a
matter of fact there was no question from Watson. On their way to
Number 3 Lauriston Gardens, where the body of Enoch J. Drebber of
Cleveland, Ohio, and Salt Lake City, Utah, had been found, Dr.
Watson interrupts Holmes’s discourse on “Cremona fiddles, and the
difference between a Stradivarius and an Amati” to remark, “ ‘You don’t
seem to give much thought to the matter in hand’ ” (Doyle 1967,
1:166); whereupon Holmes gives the response quoted above.

Leamer’s is the most unfair of all attributions. He implies that
Holmes simply provided an after-the-fact rationalization of the
events. On the contrary, the great detective subscribed to the
positivistic methodology of another great economist, Milton Friedman
(1958), even before the latter had expounded it. In case after case
Holmes collected data, formed a hypothesis about the identity and the
motive of the culprit, and tested his hypothesis, not on the inside
sample observations, but by predicting the future move of his suspect
or the likely hiding place of a missing object. “The Adventure of the
Speckled Band,” “The Red-headed League,” “The Adventure of the
Six Napoleons,” “The Disappearance of Lady Frances Carfax,” and
“The Adventure of the Creeping Man” are only a sample of such
cases.

What Holmes pointed out on the problem of premature theorizing
resembles Sir Dennis Robertson’s lack of attention to Kondratieff’s
long cycle because there were not enough data points to enable a
meaningful study of such phenomena (Presley 1979, p. 18).


III. Concluding Remarks 

In this note I have shown that economists referring to the exploits of
Sherlock Holmes have done so on the basis of a cursory reading of the
Sacred Writings. Thus they have reflected the facts inaccurately and
occasionally have even been less than kind to the greatest detective of
all time. I have made an attempt to rectify this state of affairs.

Kamran M. Dadkhah


Northeastern University